Mon. Not. R. Astron. Soc. 000, 000-000 (0000) Printed 5 February 2008 



(MN LaTeX style file v2.2)



Error analysis in cross-correlation of sky maps: 
application to the ISW detection 

Anna Cabre, Pablo Fosalba, Enrique Gaztanaga & Marc Manera 

Institut de Ciencies de l'Espai, IEEC-CSIC, Campus UAB, F. de Ciencies, Torre C5 par-2, Barcelona 08193, Spain
5 February 2008 

ABSTRACT 

Constraining cosmological parameters from measurements of the Integrated
Sachs-Wolfe (ISW) effect requires developing robust and accurate methods for
computing statistical errors in the cross-correlation between maps. This
paper presents a detailed comparison of such error estimation applied to the
case of cross-correlation of Cosmic Microwave Background (CMB) and
large-scale structure data. We compare theoretical models for error
estimation with Montecarlo simulations where both the galaxy and the CMB
maps vary around a fiducial auto-correlation and cross-correlation model
which agrees well with the current concordance ΛCDM cosmology. Our analysis
compares estimators both in harmonic and configuration (or real) space,
quantifies the accuracy of the error analysis and discusses the impact of
partial sky survey area and the choice of input fiducial model on
dark-energy constraints. We show that purely analytic approaches yield
accurate errors even in surveys that cover only 10% of the sky and that
parameter constraints strongly depend on the fiducial model employed.
Alternatively, we discuss the advantages and limitations of error estimators
that can be directly applied to data. In particular, we show that errors and
covariances from the Jack-Knife method agree well with the theoretical
approaches and simulations. We also introduce a novel method in real space
that is computationally efficient and can be applied to real data and
realistic survey geometries. Finally, we present a number of new findings
and prescriptions that can be useful for analysis of real data and
forecasts, and present a critical summary of the analyses done to date.



1 INTRODUCTION 

The ISW effect (Sachs & Wolfe 1967) has emerged as a new and powerful tool
to probe our universe on the largest physical scales, testing deviations
from General Relativity and the existence of dark energy independently of
other classical probes (e.g., Crittenden and Turok 1996; Bean and Dore 2004;
Lue et al. 2004; Cooray et al. 2004; Garriga et al. 2004; Song et al. 2006).
Recently, a number of groups have obtained the first detections of the ISW
effect by cross-correlating low redshift tracers of the large scale
structure (LSS) with the cosmic variance limited cosmic microwave background
(CMB) maps obtained by WMAP (e.g., Boughn and Crittenden 2004; Nolta et al.
2004; Fosalba and Gaztanaga 2004; Fosalba et al. 2003; Scranton et al. 2003;
Afshordi et al. 2004; Rassat et al. 2006; Cabre et al. 2006). Although
current detections are only claimed at the 2-4σ level, all analyses
coherently favor a flat ΛCDM model that is consistent with WMAP observations
(Spergel et al. 2006). Moreover, the redshift evolution of the measured
signal already provides first constraints on alternative cosmological
scenarios (Corasaniti et al. 2005; Gaztanaga et al. 2006).

However, sample variance from the primary CMB anisotropies limits the
ability with which one can detect CMB-LSS correlations. For the
observationally favored flat ΛCDM model, even an optimal measurement of the
cross-correlation could only achieve a signal-to-noise ratio of ~10
(Crittenden and Turok 1996; Peiris and Spergel 2000; Afshordi 2004; see also
Fig. 17 below). Given the low significance level of ISW detections, a good
understanding of the systematic and statistical errors is crucial to
optimally exploit CMB-LSS correlation data that will be collected in future
surveys such as PLANCK, DES, SPT, LSST, etc., for cosmological purposes
(see e.g., Pogosian et al. 2005). Recent work has focused on the impact of
known systematics on cross-correlation measurements (Boughn and Crittenden
2005; Afshordi 2004); however, no detailed analysis has been carried out to
assess how different error estimates compare or what accuracy each of them
delivers. So far, most of the published analyses have implemented one
specific error estimator (primarily in real or configuration space) without
justifying the choice of that particular estimator or quantifying its degree
of accuracy.

In particular, most of the groups that first claimed ISW detections (Boughn
and Crittenden 2004; Nolta et al. 2004; Fosalba and Gaztanaga 2004; Fosalba
et al. 2003; Scranton et al. 2003; Rassat et al. 2006) estimated errors from
CMB Gaussian Montecarlo (MC) simulations alone. In this approach statistical
errors are obtained from the dispersion of the cross-correlation between the
CMB sky realizations and a (single) fixed observed map tracing the nearby
large-scale structure. This estimator is expected to be reasonably accurate
as long as the cross-correlation signal is weak and the CMB autocorrelation
dominates the total variance of the estimator. We shall call this error
estimator MC1 (see below).

Fosalba, Gaztanaga & Castander (2003) and Fosalba & Gaztanaga (2004) also
used Jack-knife (JK) errors. They found that the JK errors perform well as
compared to the MC1 estimator, but the JK error from the real data seems up
to a factor of two smaller (on sub-degree scales) than the JK error
estimated from simulations. This discrepancy arises from the fact that the
fiducial theoretical model used in the MC1 simulations does not match the
best fit to the data (see conclusions).

Afshordi et al. (2004) criticize the MC1 and JK estimators and implement a
purely theoretical Gaussian estimator in harmonic space (which we shall call
TH below). However, they did not show why their choice of estimator should
be preferable, nor did they validate it with simulations. This criticism of
the JK approach has spread through the literature without any critical
assessment. Vielva, Martinez-Gonzalez & Tucci (2006) also point out the
apparent limitations of the JK method and adopt the MC1 simulations instead.
However, they seem to find that the signal-to-noise of their measurement
depends on the statistical method used.

Padmanabhan et al. (2005) use a Fisher matrix approach and MC1-type
simulations to validate and calibrate their errors. They also claim that JK
errors tend to underestimate errors because of the small number of
uncorrelated JK patches on the sky, but they provide no proof of that.

Giannantonio et al. (2006) use errors from MC simulations that follow the
method put forward by Boughn et al. (1997). In their work the error
estimator is built from pairs of simulation maps (of the CMB and large-scale
structure fields) including the predicted auto and cross-correlation. This
is the estimator we shall name MC2 below. They point out that their results
are consistent with what is obtained from the simpler MC1 estimator.

In this paper we develop a systematic approach to compare different error
estimators in cross-correlation analyses. Armed with this machinery, we
address some of the open questions that have been raised in previous work on
the ISW effect detection: How accurate are JK errors? Are error estimators
different due to the input theoretical models or the data themselves? How
many Montecarlo simulations should one use to get accurate results? Can we
safely neglect the cross-correlation signal in the simulations? Do harmonic
and real space methods yield compatible results? What is the uncertainty
associated with the different error estimates?

The methodology and results presented here should largely apply to other
cross-correlation analyses of different sky maps such as galaxy-galaxy or
lensing-galaxy studies.

This paper is organized as follows: Section 2 presents the Montecarlo and
the theoretical methods used to compute the errors for the
galaxy-temperature cross-correlation signal. Sections 3 & 4 show a
comparison between the normalized covariances and diagonal errors from
different estimators. The impact of the choice of error estimator on
cosmological parameters is discussed in Section 5. Finally, in Section 6 we
summarize our main results and conclusions.



2 METHODS 

We consider four methods to estimate errors. The first one is based on
Montecarlo (MC) simulations of the pair of maps we want to correlate. We
consider two variants: MC2, where pairs are correlated with a given fiducial
model, and MC1, where one map in each pair is fixed and no cross-correlation
signal is included. The next two methods rely on theoretical estimation. We
will use a popular harmonic space prediction, that we shall call TH (Theory
in Harmonic space). We will also introduce a novel error estimator that is
an analytic function of the auto and cross-correlation of the fields in real
space, that we shall call TC (Theory in Configuration space). Finally, we
will estimate Jack-Knife (JK) errors, which use sub-regions of the actual
data map to calculate the dispersion in our estimator.

Once we have errors estimated in one space, it is also possible to translate
them, through a Legendre transformation, into the complementary space. We
shall make a clear distinction between the method for the error calculation
(i.e., MC, TH, TC or JK), and the estimator onto which the errors are
propagated, i.e., either $w(\theta)$ or $C_\ell$. For example, TH-w means
errors in $w(\theta)$ propagated from theoretical errors originally computed
in harmonic space. This notation is summarized in Table 1.

In all cases (except for the JK) we are assuming Gaussian statistics. In
principle, it is also possible to do all this with non-Gaussian statistics
but this requires particular non-Gaussian models, which are currently not
well motivated by observations. Ultimately, our focus here is on the
comparison of different methods for a well defined set of reasonable
assumptions.
Table 1. Notation used in this paper.

TC           Theory in Configuration space
TH           Theory in Harmonic space
MC           Montecarlo simulations
MC2          MC of the 2 fields, with correlation signal
MC1          MC of 1 field alone (CMB), no correlation signal
JK           Jack-knife errors
MC2-w        errors in $w_{TG}$ from MC2 simulations
MC1-w        errors in $w_{TG}$ from MC1 simulations
MC2-$C_\ell$ errors in $C_\ell$ from MC2 simulations
TH-w         errors in $w_{TG}$ from TH theory
TC-w         errors in $w_{TG}$ from TC theory
TH-$C_\ell$  errors in $C_\ell$ from TH theory
JK-w         errors in $w_{TG}$ from JK simulations



2.1 Montecarlo Simulations (MC) 

We have run 1000 Montecarlo (MC) simulation pairs of the CMB temperature
anisotropy and the dark-matter over-density field, including their
cross-correlation, following the approach presented in Boughn et al. (1997)
(see Eq. 2 below). These simulations are produced using the synfast routine
of the Healpix package$^1$. We assume that both fields are Gaussian: this is
a good approximation for the CMB field on the largest scales (i.e., a few
degrees on the sky), which are the relevant scales for the ISW effect.
However, the matter density field is weakly non-linear on these scales
(e.g., see Bernardeau et al. 2002) and therefore non-Gaussian, i.e., it has
non-vanishing higher-order moments. Therefore our simulations are realistic
as long as non-Gaussianity does not significantly alter the CMB-matter
cross-correlation and its associated errors.$^2$ We take galaxies to be fair
tracers of the underlying spatial distribution of the matter density field
on large scales: we assume that a simple linear bias model relates both
fields, so that $w_{GG} = b^2\, w_{MM}$ and $w_{TG} = b\, w_{TM}$.
Therefore, in what follows, we make no distinction between matter and
galaxies in our analysis (other than $b$), without loss of generality.

Decomposing our simulated fields on the sphere, we have

$$\delta(\hat{n}) = \sum_{\ell m} a_{\ell m}\, Y_{\ell m}(\hat{n}) \qquad (1)$$

where $a_{\ell m}$ are the amplitudes of the scalar field projected on the
spherical harmonic basis $Y_{\ell m}$. In our simulations, the
$a_{\ell m}$'s are given by linear combinations of unit variance random
Gaussian fields $\psi$,

$$a^{T}_{\ell m} = \sqrt{C^{TT}_\ell}\; \psi_{1,\ell m}$$
$$a^{G}_{\ell m} = \frac{C^{TG}_\ell}{\sqrt{C^{TT}_\ell}}\; \psi_{1,\ell m}
  + \sqrt{C^{GG}_\ell - \frac{(C^{TG}_\ell)^2}{C^{TT}_\ell}}\; \psi_{2,\ell m} \qquad (2)$$

for the CMB temperature (T) and galaxy over-density (G) fields,
respectively, and $C^{XY}_\ell$ is the (cross) angular power spectrum of the
X and Y fields.
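As an illustration of why the linear combination in Eq. (2) reproduces the input spectra, the following minimal sketch draws many pairs of amplitudes for a single mode and checks their second moments. It is not the synfast implementation: the function names are hypothetical, the amplitudes are treated as real numbers for simplicity, and the spectrum values used in the check are arbitrary.

```python
import math
import random

def draw_correlated_pair(c_tt, c_gg, c_tg, rng):
    """Draw one (a_T, a_G) pair for a single (l, m) mode following Eq. (2)."""
    psi1 = rng.gauss(0.0, 1.0)  # unit-variance Gaussian deviates
    psi2 = rng.gauss(0.0, 1.0)
    a_t = math.sqrt(c_tt) * psi1
    a_g = (c_tg / math.sqrt(c_tt)) * psi1 \
        + math.sqrt(c_gg - c_tg ** 2 / c_tt) * psi2
    return a_t, a_g

def sample_spectra(c_tt, c_gg, c_tg, n_draws=200000, seed=1):
    """Estimate <a_T a_T>, <a_G a_G>, <a_T a_G> from n_draws realizations."""
    rng = random.Random(seed)
    s_tt = s_gg = s_tg = 0.0
    for _ in range(n_draws):
        a_t, a_g = draw_correlated_pair(c_tt, c_gg, c_tg, rng)
        s_tt += a_t * a_t
        s_gg += a_g * a_g
        s_tg += a_t * a_g
    return s_tt / n_draws, s_gg / n_draws, s_tg / n_draws
```

The sample moments should approach the input $C^{TT}_\ell$, $C^{GG}_\ell$ and $C^{TG}_\ell$ as the number of draws grows, which is the property the MC2 simulations rely on.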

Our simulations assume a model that is broadly speaking in agreement with
current observations, although the precise choice of parameter values is not
critical for the purpose of this paper. We assume adiabatic initial
conditions and a spatially flat FRW model with the following fiducial
cosmological parameters: $\Omega_{DE} = 0.7$, $\Omega_B = 0.05$,
$\Omega_\nu = 0$, $n = 1$, $h = 0.7$, $\sigma_8 = 0.9$. Although we will
base most of our analyses on this fiducial model, we have also run a set of
1000 MC simulations for a more strongly dark-energy dominated ΛCDM model
with $\Omega_{DE} = 0.8$ (other parameters remain as in our fiducial model).
This will allow us to test how robust our main results are to changes around
our fiducial model.

Galaxies are distributed in redshift according to an analytic selection
function,

$$\frac{dN}{dz} = \frac{3}{2}\, \frac{z^2}{z_0^3}\, e^{-(z/z_0)^{3/2}} \qquad (3)$$

where $z_m = 1.412\, z_0$ is the median redshift of the source distribution
and, by definition, $\int dz\, dN/dz = 1$. Note that for such a selection
function, one can show that its width simply scales with its median value,
$\sigma_z \simeq z_m/2$. For convenience we shall take $z_m = 0.33$ as our
fiducial model. For all the sky we have set the monopole ($\ell = 0$) and
dipole ($\ell = 1$) contribution to zero in order to be consistent with the
WMAP data.
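The normalization and the median quoted for the selection function can be checked numerically. The sketch below is only an illustration (function names are hypothetical): it uses the substitution $u = (z/z_0)^{3/2}$, under which the cumulative distribution is $1 - (1+u)\,e^{-u}$, and finds the median by bisection.

```python
import math

def dndz(z, z0):
    """Selection function of Eq. (3)."""
    return 1.5 * z ** 2 / z0 ** 3 * math.exp(-((z / z0) ** 1.5))

def cumulative(z, z0):
    """Integral of dN/dz from 0 to z: 1 - (1 + u) exp(-u), u = (z/z0)^1.5."""
    u = (z / z0) ** 1.5
    return 1.0 - (1.0 + u) * math.exp(-u)

def median_redshift(z0, tol=1e-12):
    """Solve cumulative(z) = 0.5 by bisection."""
    lo, hi = 0.0, 20.0 * z0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cumulative(mid, z0) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Calling `median_redshift(1.0)` recovers the factor relating $z_m$ to $z_0$, so a fiducial $z_m = 0.33$ corresponds to $z_0 = 0.33 / 1.412$.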

We have run simulations for surveys covering different areas, ranging from
an all-sky survey ($f_{sky} = 1$) to a survey that covers only 10% of the
sky ($f_{sky} = 0.1$). The latter is realized by intersecting a cone with an
opening angle of 37° from the north pole with the sphere. Larger survey
areas are obtained by taking larger opening angles. For $f_{sky} = 0.1$ we
have done the same analysis taking a compact square in the equator (with
galactic coordinates $l = 0°$ to $l = 66°$ and $b = -33°$ to $b = 33°$) and
have found similar results. We note that the wide $f_{sky} = 0.1$ survey is
comparable in area and depth to the distribution of main sample galaxies in
the SDSS DR2-DR3.
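Both geometries quoted above can be checked with elementary solid-angle formulas. The short sketch below (hypothetical helper names) computes the sky fraction of a polar cap of given opening angle and of a patch bounded in galactic longitude and latitude.

```python
import math

def fsky_polar_cap(opening_angle_deg):
    """Sky fraction of a cap of half-opening angle theta: (1 - cos theta) / 2."""
    return 0.5 * (1.0 - math.cos(math.radians(opening_angle_deg)))

def fsky_lonlat_patch(l_min_deg, l_max_deg, b_min_deg, b_max_deg):
    """Sky fraction of a patch: delta_l * (sin b_max - sin b_min) / (4 pi)."""
    delta_l = math.radians(l_max_deg - l_min_deg)
    delta_sin_b = math.sin(math.radians(b_max_deg)) - math.sin(math.radians(b_min_deg))
    return delta_l * delta_sin_b / (4.0 * math.pi)
```

Both `fsky_polar_cap(37.0)` and `fsky_lonlat_patch(0.0, 66.0, -33.0, 33.0)` come out very close to 0.1, consistent with the two realizations of the 10% survey described in the text.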



1 http://healpix.jpl.nasa.gov/ 

2 As a check, we will compare below, in Fig. 10, the results of
the MC1 simulations, which have a fixed galaxy map with Gaussian
statistics, with the results using the observed SDSS DR5, which is
not Gaussian. We find no significant differences, indicating that
the level of non-Gaussianity in observations does not influence
the error estimation much.



2.1.1 Clustering in the simulations 

We have computed the angular 2-point correlation function for the galaxy
over-density $w_{GG}$, the temperature $w_{TT}$ and their cross-correlation
$w_{TG}$, as well as their (inverse) Legendre transforms, i.e., the angular
power spectra,

$$w(\theta) = \sum_\ell \frac{2\ell+1}{4\pi}\, C_\ell\, P_\ell(\cos\theta) \qquad (4)$$

$$C_\ell = 2\pi \int_{-1}^{1} d\cos\theta\; w(\theta)\, P_\ell(\cos\theta) \qquad (5)$$

where we denote by $P_\ell$ the Legendre polynomial of order $\ell$. In real
space, we define the cross-correlation function as the expectation value of
galaxy number density $\delta_G$ and temperature $\Delta_T$ fluctuations:

$$\delta_G = \frac{N_G}{\langle N_G \rangle} - 1 \qquad (6)$$

$$\Delta_T = T - T_0 \quad (\mathrm{in}\ \mu K) \qquad (7)$$

at two positions $\hat{n}_1$ and $\hat{n}_2$ in the sky:

$$w_{TG}(\theta) = \langle \Delta_T(\hat{n}_1)\, \delta_G(\hat{n}_2) \rangle, \qquad (8)$$

where $\theta = |\hat{n}_2 - \hat{n}_1|$, assuming that the distribution is
statistically isotropic. To estimate $w_{TG}(\theta)$ from the pixel maps we
use:

$$w_{TG}(\theta) = \frac{\sum_{i,j} \Delta_T(\hat{n}_i)\, \delta_G(\hat{n}_j)}{N_{pairs}} \qquad (9)$$

where the sum extends to all pairs $i,j$ separated by
$\theta \pm \Delta\theta$. Survey mask and pixel window function effects
have been appropriately taken into account using SpICE (Szapudi et al.
2001a,b). This code has been shown to yield correct results not only on
simulations but also on real data from surveys with partial sky coverage and
complex survey geometries (Fosalba and Szapudi 2004). In Fig. 1 we show
results from the all-sky MC2 simulations, whereas Fig. 2 displays the same
for a survey covering only 10% of the sky ($f_{sky} = 0.1$). Errorbars are
computed as the rms dispersion over the MC2 simulations. For the
$C_\ell$'s, we use linear bins with $\Delta\ell = 18$, to get approximately
uncorrelated errorbars for $f_{sky} = 0.1$ (see Fig. 7 and §3). As shown in
the plots, our all-sky simulations are unbiased with respect to the input
fiducial model (continuous lines): the mean over 1000 simulations lies on
top of the theoretical (input model) curve. For finite area surveys, sample
variance makes measurements on the largest scales (i.e., lower $\ell$'s)
fluctuate around the input theoretical model$^3$.

Figure 1. 2-point correlation function (top panels) and angular power
spectra (bottom panels) for all-sky surveys with median depth $z_m = 0.3$.
Different panels correspond to TT (Temperature-Temperature), GG
(Galaxy-Galaxy) and TG (Temperature-Galaxy) cross-correlation. Errors shown
correspond to dispersion over Montecarlo MC2-type simulations binned with
$\Delta\ell = 18$ (see text for details).

Figure 2. Same as Fig. 1 but for a wide field survey $f_{sky} = 0.1$.

Figure 3. Convergence in normalized covariance matrix for Monte-Carlo
simulations (MC2). We compare covariance when we increase the number of
simulations. Here we show the difference between a) the first 100 with
respect to the first 200 simulations (100-200), b) 200-400, c) 400-700,
d) 700-1000. Results show that one needs at least 700 simulations for the
normalized covariance to converge (we use 1000 in our analysis).

Figure 4. Convergence in diagonal error for Monte-Carlo simulations. Shown
is the error for the first 100 simulations (dotted), 200 (dashed), 400
(dash-dot), 700 (dash-dot-dot) and 999 (solid line). With 200 simulations
the accuracy in errors is about 20%.
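The Legendre transform pair of Eqs. (4) and (5) can be illustrated with a short numerical sketch. The function names are hypothetical and the code is not the SpICE implementation: it ignores masks, binning and pixel windows, evaluates $P_\ell$ with the standard three-term recurrence, and uses a simple midpoint rule for the integral.

```python
import math

def legendre_all(lmax, x):
    """Return [P_0(x), ..., P_lmax(x)] via the three-term recurrence."""
    p = [1.0, x]
    for l in range(1, lmax):
        p.append(((2 * l + 1) * x * p[l] - l * p[l - 1]) / (l + 1))
    return p[:lmax + 1]

def w_from_cl(cl, theta_rad):
    """Eq. (4): w(theta) = sum_l (2l+1)/(4 pi) C_l P_l(cos theta)."""
    lmax = len(cl) - 1
    p = legendre_all(lmax, math.cos(theta_rad))
    return sum((2 * l + 1) / (4.0 * math.pi) * cl[l] * p[l]
               for l in range(lmax + 1))

def cl_from_w(w_func, ell, n_steps=4000):
    """Eq. (5): C_l = 2 pi * integral of w P_l over mu = cos(theta) in [-1, 1]."""
    h = 2.0 / n_steps
    total = 0.0
    for k in range(n_steps):
        mu = -1.0 + (k + 0.5) * h  # midpoint of the k-th interval
        total += w_func(math.acos(mu)) * legendre_all(ell, mu)[ell]
    return 2.0 * math.pi * total * h
```

Applying `cl_from_w` to the output of `w_from_cl` returns the input spectrum up to integration error, which is the all-sky consistency between the two spaces that the text relies on.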



2.1.2 Convergence in simulations 
The MC covariance is defined as:

$$C_{ij} = \frac{1}{M} \sum_{k=1}^{M} \Delta w^k_{TG}(\theta_i)\, \Delta w^k_{TG}(\theta_j) \qquad (10)$$

$$\Delta w^k_{TG}(\theta_i) = w^k_{TG}(\theta_i) - \widehat{w}_{TG}(\theta_i) \qquad (11)$$

where $w^k_{TG}(\theta_i)$ is the measurement in the k-th simulation
(k = 1, ..., M) and $\widehat{w}_{TG}(\theta_i)$ is the mean over M
realizations. The case i = j gives the diagonal error (i.e., variance).

3 When we calculate the cross-correlation in a fraction of the
sky, there is a residual monopole in the galaxy and temperature
maps, which changes the normalization of both fluctuations. In a
real survey we are limited by the survey area covered by galaxies
and we need to normalize the fluctuations using the local mean,
which is in general different from the mean in all sky (because of
sampling variance). We find that the cross-correlation calculated
with the local normalization with $f_{sky} = 0.1$ is about 10% lower
for our fiducial ΛCDM model, but the value can vary for other
models and different $f_{sky}$.
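A minimal sketch of the covariance estimate of Eqs. (10) and (11) is given below. The function name and the input layout (a list of M measurement vectors, one per simulation) are assumptions made for illustration.

```python
def mc_covariance(measurements):
    """Eqs. (10)-(11): measurements[k][i] = w_TG(theta_i) in simulation k.

    Returns the mean over realizations and the covariance matrix C_ij,
    normalized by 1/M as in Eq. (10).
    """
    m = len(measurements)
    nbins = len(measurements[0])
    mean = [sum(row[i] for row in measurements) / m for i in range(nbins)]
    cov = [[0.0] * nbins for _ in range(nbins)]
    for row in measurements:
        delta = [row[i] - mean[i] for i in range(nbins)]
        for i in range(nbins):
            for j in range(nbins):
                cov[i][j] += delta[i] * delta[j] / m
    return mean, cov
```

The diagonal of the returned matrix gives the variance in each angular bin, and its square root is the rms errorbar used in the figures.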

In order to check the numerical convergence in the computation of the
covariance matrix, we compare the results using all 1000 (MC2) simulations
with the ones using the first 100, 200, 400 or 700 simulations. For clarity
we separate our convergence analysis into the diagonal elements (the
variance) and the normalized covariance, where we divide the covariance by
the diagonal elements (see Eq. 22). As shown in Fig. 3, we find that there
is no noticeable difference (~5% accuracy) in the normalized covariance from
700 and 1000 simulations. This suggests that 700 simulations are enough for
our purposes. To be safe, we shall use all 1000 simulations to derive our
main results.

On the other hand, Fig. 4 shows the convergence of the variance estimation
(diagonal elements of the covariance matrix) for an increasing number of
simulations. One needs about 200 simulations to converge within 20%
accuracy. This is similar to the dispersion in the errors for a given
realization due to sampling variance (see Fig. 12 below). We will use 1000
simulations, which will give us better than 5% accuracy in the error
estimation from these simulations.

2.1.3 Simulations with a fixed galaxy map (MC1)

We can also calculate Montecarlo errors by cross-correlating 1000
simulations of the CMB with a fixed sky for galaxies (MC1). This is a common
practice because it is quite easy to simulate CMB maps and not so easy to
simulate galaxy maps. In this case the common (and easiest) thing to do is
not to include any cross-correlation signal in the simulated CMB maps. Thus,
this approach involves two additional approximations: no variance coming
from the galaxy maps and no cross-correlation within the maps. Despite these
approximations one expects MC1 errors to be reasonably accurate because most
of the variance should come from the large scale primary CMB anisotropies,
and the cross-correlation signal is small in comparison.

Here we want to test in detail the accuracy of this approach. We have taken
the mean of 20 different cases. Each case has a different fixed galaxy map
which is paired with 999 CMB maps that are not correlated with it. For each
fixed galaxy case we obtain an MC1 error, so we can calculate the dispersion
of this error over the 20 different galaxy maps. This will be discussed in
more detail in §4.3.

2.2 Jack-knife errors (JK) 

The JK method is closely related to the Bootstrap method (Press et al.
1992), which under certain circumstances can provide accurate errors. The
idea is that the data are grouped in sub-regions or zones which are more or
less independent.$^4$ We then use the fair sample hypothesis (i.e.,
ergodicity) to estimate the error (variance between zones) for the quantity
under study. In the Bootstrap method one defines new sub-samples (which
approach statistically independent realizations) by a random selection of
sub-regions. In the Jack-knife method each new sub-sample contains all
sub-regions but one. A potential disadvantage of the JK error is that one
may think that it cannot be used on scales that are comparable to the
sub-region size. This is not necessarily so. Rare events (such as
superclusters) can dominate sampling errors on all scales even if they only
extend over small regions (see Baugh et al. 2001). If JK sub-regions are
large enough to encompass these rare events, they can reproduce errors well
on all scales. Nevertheless it is clear that a danger with JK errors is that
the result could in principle depend on the size and shape of the
sub-regions. So this needs to be tested in each situation.

We can therefore calculate the error from each single map using the JK
method. To study the JK error in a fraction of the sky of 10%, we divide a
compact square area in M zones or sub-regions. Fig. 5 shows the case M = 36,
but we have tried different values in the range M = 20 to 80, and find
similar results. The JK regions have roughly equal area and shape. This is
important; we have found that the JK method could give unrealistic errors
when the areas or shapes are not even. To calculate the covariance, we take
a JK sub-sample to be all the data removing one of these JK zones; this
means that we remove all the pairs that fall completely or partially in the
JK zone that is removed. To compensate for the correlation between the JK
sub-samples, we multiply the resulting covariance by M - 1. The covariance
for this case is thus:

$$C_{ij} = \frac{M-1}{M} \sum_{k=1}^{M} \Delta w^k_{TG}(\theta_i)\, \Delta w^k_{TG}(\theta_j) \qquad (12)$$

$$\Delta w^k_{TG}(\theta_i) = w^k_{TG}(\theta_i) - \widehat{w}_{TG}(\theta_i) \qquad (13)$$

where $w^k_{TG}(\theta_i)$ is the measurement in the k-th sub-sample
(k = 1, ..., M) and $\widehat{w}_{TG}(\theta_i)$ is the mean over the M
sub-samples. For each of the MC2 pairs of simulated maps we have a JK
estimate of $C_{ij}$. We can therefore calculate a JK mean and its
dispersion (and distribution) to compare to the true MC2 covariance in the
maps.

Figure 5. Compact square with 36 zones, covering 10% of the sky, used to
calculate the JK error, in galactic coordinates ($l = 0°$ to $l = 66°$ and
$b = -33°$ to $b = 33°$). We see that the shapes and sizes of the zones are
similar.

4 It is not adequate here to consider individual points or pixels
as the units (sub-regions) to bootstrap because they are highly
correlated.
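The JK covariance of Eqs. (12) and (13) can be sketched in a few lines. The names below are hypothetical, and `leave_one_out_means` is only a toy stand-in for the real procedure, in which $w_{TG}$ is recomputed after removing every pair that touches the excluded zone.

```python
def leave_one_out_means(zone_values):
    """Toy estimator: mean of the per-zone values with one zone removed."""
    m = len(zone_values)
    total = sum(zone_values)
    return [(total - v) / (m - 1) for v in zone_values]

def jackknife_covariance(subsample_measurements):
    """Eqs. (12)-(13): subsample_measurements[k][i] is the measurement at
    theta_i in the k-th leave-one-out sub-sample (k = 1..M)."""
    m = len(subsample_measurements)
    nbins = len(subsample_measurements[0])
    mean = [sum(row[i] for row in subsample_measurements) / m
            for i in range(nbins)]
    factor = (m - 1) / m  # compensates the correlation between sub-samples
    cov = [[0.0] * nbins for _ in range(nbins)]
    for row in subsample_measurements:
        delta = [row[i] - mean[i] for i in range(nbins)]
        for i in range(nbins):
            for j in range(nbins):
                cov[i][j] += factor * delta[i] * delta[j]
    return cov
```

For the toy case where the estimator is a plain mean over zones, this JK variance coincides with the usual squared standard error of the mean, which is one way to see why the (M - 1) factor is needed.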

2.3 Errors in harmonic space (TH) 

Theoretical expectations for the errors are the simplest in harmonic space,
where the covariance matrix is diagonal in the all-sky limit. In particular,
for Gaussian fields, one can easily see that the variance (or diagonal
error) is,

$$\Delta^2 C^{TG}_\ell = \frac{1}{f_{sky}(2\ell+1)} \left[ (C^{TG}_\ell)^2 + C^{TT}_\ell\, C^{GG}_\ell \right]. \qquad (14)$$

This indicates that the variance of the power spectrum estimator results
from quadratic combinations of the auto and cross power, with an amplitude
that depends on the number of independent m-modes available to estimate the
power at a scale $\ell$, which is approximately given by
$f_{sky}(2\ell+1)$. We emphasize that this is only approximate and,
rigorously, it is only expected to yield accurate predictions for azimuthal
sky cuts. However, as we shall see later, this result is of more general
applicability. We note that the dominant contribution to the error and
covariance comes from the auto-power of the fields,
$C^{TT}_\ell\, C^{GG}_\ell$, involved in the cross-correlation, whereas the
cross-correlation signal $(C^{TG}_\ell)^2$ only gives a few percent
contribution, depending on cosmology and survey selection function.
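Eq. (14) is simple enough to evaluate directly. The sketch below uses hypothetical function names and arbitrary illustrative spectrum values; it returns the variance of the cross-power estimator and the fraction of that variance contributed by the cross-correlation term.

```python
def th_variance(c_tt, c_gg, c_tg, ell, fsky=1.0):
    """Eq. (14): Gaussian variance of the cross-power estimator at multipole ell."""
    return (c_tg ** 2 + c_tt * c_gg) / (fsky * (2 * ell + 1))

def cross_term_fraction(c_tt, c_gg, c_tg):
    """Fraction of the variance in Eq. (14) coming from (C^TG)^2."""
    return c_tg ** 2 / (c_tg ** 2 + c_tt * c_gg)
```

Reducing `fsky` from 1 to 0.1 increases the variance tenfold at fixed spectra, reflecting the smaller number of independent modes.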
Partial sky coverage introduces a boundary which results in the coupling (or
correlation) of different $\ell$ modes: the spherical harmonic basis is no
longer orthonormal on an incomplete sky. Thus the covariance matrix between
different modes,

$$Cov(C_\ell, C_{\ell'}) = \langle (C_\ell - \langle C_\ell \rangle)(C_{\ell'} - \langle C_{\ell'} \rangle) \rangle \qquad (15)$$

is no longer diagonal (see Fig. 7). Because of the partial sky coverage
there is less power on the smaller multipoles. This results in a systematic
bias on the low multipoles of $C_\ell$ that can sometimes be modeled with
the appropriate window correction of the survey mask.

Using the Legendre transform one can propagate the error $\Delta C_\ell$ in
Eq. 14 above to configuration space,

$$\Delta^2 w(\theta) = \sum_\ell \left( \frac{2\ell+1}{4\pi} \right)^2 P_\ell^2(\mu)\, \Delta^2 C_\ell, \qquad (16)$$

where $\mu = \cos\theta$. For the covariance matrix, we find:

$$C_{ij} = Cov(w(\theta_i), w(\theta_j)) = \sum_\ell \left( \frac{2\ell+1}{4\pi} \right)^2 P_\ell(\mu_i)\, P_\ell(\mu_j)\, \Delta^2 C_\ell, \qquad (17)$$

where $\mu_i \equiv \cos\theta_i$. Eq. (16) and Eq. (17) assume that
different $\ell$ multipoles are uncorrelated, which is only strictly true
for all-sky surveys. We shall see below that this approximation is quite
accurate anyway, even for surveys that cover only 10% of the sky, i.e.,
cosmological parameter contours derived from this expression do not
significantly differ from those computed with simulations that take into
account the exact covariance matrix.
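The propagation in Eqs. (16) and (17) amounts to a weighted sum of products of Legendre polynomials. The following sketch (hypothetical names, uncorrelated multipoles assumed as in the text) builds the real-space covariance from a list of harmonic-space variances.

```python
import math

def legendre_all(lmax, x):
    """Return [P_0(x), ..., P_lmax(x)] via the three-term recurrence."""
    p = [1.0, x]
    for l in range(1, lmax):
        p.append(((2 * l + 1) * x * p[l] - l * p[l - 1]) / (l + 1))
    return p[:lmax + 1]

def covariance_w_from_cl_variance(var_cl, thetas_rad):
    """Eq. (17): var_cl[l] is Delta^2 C_l; returns C_ij for the given angles."""
    lmax = len(var_cl) - 1
    weights = [((2 * l + 1) / (4.0 * math.pi)) ** 2 * var_cl[l]
               for l in range(lmax + 1)]
    pl = [legendre_all(lmax, math.cos(t)) for t in thetas_rad]
    n = len(thetas_rad)
    return [[sum(weights[l] * pl[i][l] * pl[j][l] for l in range(lmax + 1))
             for j in range(n)]
            for i in range(n)]
```

The diagonal of the returned matrix is Eq. (16). Because every multipole contributes to every angle, the real-space covariance is strongly non-diagonal even when the harmonic-space errors are uncorrelated.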



2.4 Errors in configuration space (TC)

The cross-correlation function in configuration space is estimated by
averaging over all pairs of points separated by an angle $\theta$ in the
survey,

$$w_{TG}(\theta) = \langle \Delta_T(\mathbf{q})\, \delta_G(\mathbf{q}')\, |_{\widehat{\mathbf{q}\mathbf{q}'} = \theta} \rangle_{survey}. \qquad (18)$$

We have derived a formula for the covariance of the estimator in an ensemble
of sky realizations. Details of this derivation can be found in Appendix A.

[Eq. (19): expression not recoverable from the source text; it is an
integral of the kernel $K[\theta, \theta', \psi]$ over
$\sin\psi\, d\psi$ from 0 to $\pi$, normalized using the pair
probabilities $P(\theta)$.]

where the kernel $K$ is given by:

$$K[\theta, \theta', \psi] = \frac{1}{2} \left[ W_{TT}(\theta, \psi)\, W_{GG}(\theta', \psi) + W_{TT}(\theta', \psi)\, W_{GG}(\theta, \psi) \right] + W_{TG}(\theta, \psi)\, W_{TG}(\theta', \psi) \qquad (20)$$

and $W_X$ is a mean over the corresponding correlation $w_X$, with
X = TT, GG or TG:

$$W_X(\theta, \psi) = 2 \int_0^\pi d\varphi\; P(\psi, \theta, \phi)\, w_X(\phi) \qquad (21)$$

where $\cos\phi = \cos\theta \cos\psi + \sin\theta \sin\psi \cos\varphi$.
Survey geometry is encoded in the $P(\theta)$ and $P(\psi, \theta, \phi)$
probabilities. These are the probabilities for two points separated by an
angle $\theta$, or for a triangle of sides $\psi, \theta, \phi$, to fall
completely into the survey area if they are thrown randomly on the full sky.
For partial sky surveys these probabilities depend mainly on the survey area
and can be well approximated by the formula provided in Appendix A.
Particularly simple analytic expressions can be obtained for a "polar cap"
survey (area obtained by intersecting a cone with the sphere) and are given
in Appendix B.

This new method of computing errors in real space has several advantages.
Since it takes into account the survey geometry, it can provide more
accurate errors at large angles, where both the jackknife errors and the
harmonic-space errors become more inaccurate. Compared to Montecarlo errors
this method is faster because one does not need to generate a large number
of sky realizations. What is more, this estimator does not need to rely on
any theoretical/fiducial model and one can readily apply it to correlation
functions measured on the real data to estimate the errors.$^5$

5 A FORTRAN code (named TC-ERROR) which takes as input
$w_{TG}$, $w_{TT}$ and $w_{GG}$ and computes the covariance matrix and
errors for the cross-correlation function can be obtained upon
request from the authors (please contact Marc Manera). Of course,
this code can also be used to estimate the autocorrelation error
in a single map by just placing $w_{TG} = w_{TT} = 0$.
[Figure 6 panels: COV MC2-w 10%; COV JK-w 10%; COV MC1-w 10%; COV TC-w 10%; COV TH-w 10% and all sky; COV MC2-w all sky]

Figure 6. Normalized covariances in real space for different methods as
labeled in the figure. No significant changes are found for different
methods and sky fractions used.

Figure 7. Normalized covariances in $C_\ell$ space from the Montecarlo (MC2)
method. The covariance becomes progressively dominated by its diagonal
elements as we increase the sky fraction $f_{sky}$.



3 NORMALIZED COVARIANCE MATRIX 

For each one of the methods presented in the previous section, we next
compare the normalized covariance:

$$\hat{C}_{ij} = \frac{C_{ij}}{\sqrt{C_{ii}\, C_{jj}}} \qquad (22)$$

The diagonal values (the variance) and associated dispersion
will be investigated in §4.
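The normalization of Eq. (22) can be written as a one-line helper; the name below is hypothetical and the input is any covariance matrix given as a list of rows.

```python
import math

def normalized_covariance(cov):
    """Eq. (22): divide C_ij by sqrt(C_ii * C_jj); the diagonal becomes 1."""
    n = len(cov)
    return [[cov[i][j] / math.sqrt(cov[i][i] * cov[j][j]) for j in range(n)]
            for i in range(n)]
```

Separating the matrix in this way lets the shape of the correlations between bins be compared across methods independently of the overall size of the errors.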



3.1 Configuration space 

As shown in Fig. 6, all the normalized covariances $\hat{C}_{ij}$ in real
space are very similar. The appearance of the plots does not seem to depend
strongly on the method we use to estimate them, or on the survey area
$f_{sky}$. Here we only show results for 10% and all the sky, but
intermediate values yield similar results. However, we want to ask whether
slight differences in the covariance could have a non-negligible impact on
cosmological parameter estimation. We will discuss this in detail in §5.3.



3.2 Harmonic space 

In $C_\ell$ space, there is no correlation between different $\ell$-modes
(bins $\Delta\ell = 1$) for the case of all-sky (MC2) maps. The normalized
covariance matrix is diagonal, as can be seen in the right panel of Fig. 7.
Also shown, in the left and central panels, are the results for 10% and 40%
of the sky, where the covariance between modes gives rise to large amplitude
off-diagonal elements. This is in sharp contrast to the results in
configuration space (in Fig. 6), where there is no significant difference
between normalized covariances when we decrease the area. This is because
the main effect of increasing the area in configuration space is the
reduction of diagonal errors (which are shown in the next section), while in
harmonic space there is a transfer of power from diagonal to off-diagonal
elements.

Figure 9. Eigenvectors in real space for 10% of the sky and for all sky, as
labeled in each panel. The first eigenvector is shown as solid lines, the
second as dotted, the third dashed and the fourth dot-dash. Results for
MC1-w and TC-w, not shown here, are very similar.



3.3 Eigenvalues and Eigenvectors from SVD 

To calculate the $\chi^2$ distribution and the signal-to-noise we need to invert the covariance matrix. We use the Singular Value Decomposition (SVD) method to decompose the covariance into two orthogonal matrices $U$ and $V$ and a diagonal matrix $W$, which contains the singular values $\lambda_i$ squared on the diagonal (e.g. see Press et al. 1992). This method is adequate to separate the signal from the noise:

$$\hat{C}_{ij} = (U_{ik})^{\dagger} W_{kl} V_{lj} \qquad (23)$$

where $W_{ij} = \lambda_i^2 \delta_{ij}$ and $\hat{C}_{ij}$ is the normalized covariance in Eq.22. By doing this decomposition, we can choose the number of modes that we wish to include in the analysis.



This SVD is effectively a decomposition in different modes 
ordered in decreasing amplitude. 
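The decomposition and mode selection can be sketched as follows. This is an illustration under stated assumptions: it uses numpy's SVD convention (matrix = u @ diag(w) @ vt, with w in decreasing order), which may differ from the index convention of Eq.23 by transposes, and the toy matrix and the function name `svd_modes` are hypothetical.

```python
import numpy as np

def svd_modes(norm_cov, n_modes):
    """Singular values and a mode-truncated inverse of a normalized covariance."""
    # numpy convention: norm_cov = u @ diag(w) @ vt, w sorted in decreasing order.
    u, w, vt = np.linalg.svd(np.asarray(norm_cov, dtype=float))
    lam = np.sqrt(w)  # the diagonal of W holds lambda_i squared
    k = n_modes
    # Keep only the k dominant modes when inverting.
    inverse = (vt[:k].T / w[:k]) @ u[:, :k].T
    return lam, u, inverse

# Toy normalized covariance (hypothetical): correlations decaying with bin separation.
idx = np.arange(6)
toy_norm_cov = 0.8 ** np.abs(idx[:, None] - idx[None, :])
lam, modes, inv_all = svd_modes(toy_norm_cov, n_modes=6)
_, _, inv_4 = svd_modes(toy_norm_cov, n_modes=4)
```

With all modes kept, `inv_all` is the ordinary inverse; with fewer modes, `inv_4` discards the lowest-amplitude modes, which are the ones most affected by noise in the covariance estimate.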

We obtain very similar singular values for each mode and for each method, as shown in Fig. 8 for some of the cases (other cases give very similar results).

We can understand the effect of modal decomposition by looking at the eigenvectors shown in Fig. 9, where we have plotted the four dominant eigenvectors as a function of angle: the first mode (solid) affects only the amplitude, and the second mode (dotted) shows a bimodal pattern. The following modes, third (dashed) and fourth (dot-dash), correspond to modulations on smaller angular scales. As can be seen in the figure, we obtain nearly the same eigenvectors in all the cases, in agreement with what was found by direct comparison of the covariance matrices in Fig. 6. Again, we can ask: are the small differences significant? We will study this in detail in §5.

Figure 10. Error calculated with different methods (as labeled in the figure) in real space for $\Omega_{DE} = 0.7$. For a map covering 10% of the sky (top lines and symbols), the TC (crosses) and TH (dashed line) theoretical errors work well compared to the montecarlo MC2 simulations (solid line), while the MC1 simulations (dotted line) seem to underestimate the errors by 10%. The JK method (triangles) seems slightly biased up/down on large/small scales, although all the errors are compatible given the sampling variance dispersion we expect (see Fig. 12). For all sky maps (bottom lines and symbols), we show how results for MC2 (solid), TH (dashed) and TC-w (cross) agree very well.

4 VARIANCE & ERRORS 
4.1 Variance in $w(\theta)$

Fig. 10 is one of the main results of this paper. We compare the variance for the different methods, which is the diagonal part of the covariance matrix. By construction, in the limit of an infinite number of realizations, the MC2 error from simulations should provide the best approximation to the errors. We have demonstrated (in §2.1.2) that 1000 simulations are enough for convergence within 5% accuracy. For all sky maps (lower lines in the figure) we can see that the three methods used, MC2-w, TH-w and TC-w, yield identical results, as expected. For smaller survey areas we do expect some deviations, because of the different approximations in dealing with the survey boundary. For a survey covering 10% of the sky these 3 methods also agree well up to 10 degrees. At larger scales TH-w (dashed lines) starts to deviate, because boundary effects are not taken into account in this method. The JK error (triangles) has a slope as a function of $\theta$ that seems less steep than for the other methods, but still gives a reasonably good approximation given that the dispersion in the errors is about 20% (as discussed in §4.3 below). Note how on scales larger than 10 degrees the JK method performs better (i.e. it is closer to MC2) than the TH-w error. The TC method seems to account well for the boundary effects, as it reproduces the MC2 errors all the way to 50 degrees, where all other methods fail.

If we only use one single realization for the galaxies (MC1) the error seems to be systematically underestimated by about 10% on all scales. This bias is expected as we have neglected the variance in the galaxy field and the cross-correlation signal. A particular case of MC1 is done with real data from SDSS DR5 (shown as long dashed line in Fig. 10). We have used here a compact square of 10% of the sky from the SDSS r magnitude slice of 20-21, which has a redshift selection function similar to the one in our simulations ($z_m \simeq 0.33$). This case works surprisingly well once scaled with the linear bias $b$ (estimated by comparing the measured galaxy auto-correlation function with the one in our fiducial $\Lambda$CDM model). It happens to closely follow the JK prediction, rather than the MC1 prediction, but we believe this is just a fluke, given the dispersion in the errors (see §4.3) and the uncertainties in the fiducial model.

Figure 11. Error at two fixed angles, 5 deg (upper lines) and 20 deg (lower lines), as a function of $f_{sky}$, the fraction of the sky covered by the map. As predicted, errors decrease as $1/\sqrt{f_{sky}}$ in both cases. Note how for small areas the TC-w prediction (continuous line) performs better than the TH-w model (dashed line), as it better reproduces the MC2 simulations (squares).

4.2 Effect of partial sky coverage 

We have tested MC2-w, TH-w and TC-w for different partial sky survey areas $f_{sky}$ and obtained similar results. In Fig. 11 we have plotted the error for a fixed angle of 5 degrees (top) and 20 degrees (bottom) for the different values of $f_{sky}$. The three methods coincide for large areas. The error scales by a factor $1/\sqrt{f_{sky}}$, as expected.
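A quick arithmetic check of this scaling, as a sketch only: the helper name is hypothetical, and the rescaling is only expected to hold at angles well below the survey size, where boundary effects are small.

```python
import math

def scale_error_to_partial_sky(all_sky_error, f_sky):
    """Rescale an all-sky error to a survey covering a fraction f_sky of the sky."""
    if not 0.0 < f_sky <= 1.0:
        raise ValueError("f_sky must be in (0, 1]")
    return all_sky_error / math.sqrt(f_sky)

# A survey covering 10% of the sky has errors about 3.16 times the all-sky ones.
factor_10pct = scale_error_to_partial_sky(1.0, 0.1)
```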

Notice that errors at angles comparable to the width of the survey are difficult to estimate theoretically because one needs to take into account the survey geometry. Even for a map as wide as 10% of the sky, the survey geometry starts to be important for errors in the cross-correlation above 10 degrees. This is shown in simulations as a sharp inflection that begins at 30 degrees in Fig. 10 (solid line). Our new TC-w method predicts this inflection well, while the more traditional TH method totally misses this feature. This can also be seen in Fig. 11 for 20 degrees when we approach small values of $f_{sky}$.

Figure 12. Dispersion of the error calculated with different methods in real space for $\Omega_{DE} = 0.7$. For MC2-w (solid line) or MC1-w (dotted lines), we take each pair of MC2 or MC1 simulations as the input for the error in the TH-w calculation in Eq. 16. For JK (triangles), we have one error for each MC2 simulation. We can see how the error in the error is of order 20% for MC2 or JK and is lower for MC1 (mainly because one of the maps in each pair is kept fixed). The lower line corresponds to the dispersion in all sky maps.



4.3 Uncertainty in $w(\theta)$ errors

To assess the significance of the differences in the error estimation that we find using different methods, we compute here the sampling uncertainties associated with error estimation. Fig. 12 shows the sampling dispersion in the error estimates. This can be calculated from the TH and TC approaches by using the $C_\ell$ or $w(\theta)$ measured in each realization as the input model for theoretical predictions (Eq.16 or Eq.19). In Fig. 12 the solid (or dotted) line shows the result of using Eq.16 for each of the MC2 (or MC1) simulations. This produces an error for each realization and we can therefore study the error distribution. The uncertainty in the error (or error in the error) corresponds to the rms dispersion of this distribution. We need all the multipoles to compute this Legendre transformation (Eq.16), although we lose some information for low multipoles when we use only a fraction of the sky. The error propagation in Eq.16 is not linear and we find that this produces a bias of 3% when we compare the mean of the propagated errors in each simulation with the propagation of the mean error in all simulations.

We can also calculate the JK-w dispersion of the error, because we have the JK error for each MC2 simulation (remember that we only need one realization to obtain the JK error). The JK-w dispersion (triangles) in Fig. 12 is quite close to the MC2-w values. They are both of the order of 20% relative to the mean error. This uncertainty can be interpreted as the result of the uncertainties in our input model; typically the model is only known to the accuracy given by the data and a given sky realization will deviate from the 'true' model (i.e., the mean over realizations). Thus, if one chooses to use the estimated values from the data (or its best-fit model) as input to the error estimation, this produces an uncertainty in the error which is of the order of this scatter. This is always the case with the JK errors, which do not use any model, but the uncertainty is similar if we use direct measurements as input to the other error estimations, as shown in Fig. 12.

Figure 13. Histograms show the distribution of JK errors (horizontal axis) from 1000 simulations at different angles. The solid line shows a Gaussian with the same mean and dispersion. Vertical lines correspond to the mean JK error (solid) and the true mean MC2 error (dotted).

For completeness, Fig. 12 also shows the dispersion for the MC1 error (dotted). There is less dispersion in the MC1 method because one of the maps is always fixed and this reduces both the error and, more strongly, its dispersion.



4.4 Error distribution for JK 

Fig. 13 shows the distribution of JK errors in the MC2 simulations as compared to a Gaussian fit with the same mean and dispersion. Each panel shows the distribution of $w(\theta)$ errors at a given fixed angle. The mean MC2-w error (shown as solid line in Fig. 10) is shown here by a dotted vertical line, while the mean of the JK errors (shown as triangles in Fig. 10) corresponds here to the continuous vertical line. We can see here how the MC2-w error and the mean JK-w error are quite similar. The variance in the distribution agrees with the results in §4.3 above. Note also that the JK distribution of $w(\theta)$ errors can be well fitted by a Gaussian. This is important for two reasons. First, it shows that there are no important outliers or systematic bias when one uses a JK estimator in a single realization, as is the case with real data. Second, it indicates that the error in the error (i.e. the rms dispersion of this distribution) contains all the relevant information needed to assess in more detail the accuracy of the JK error analysis. One could for example fold the uncertainties in this distribution to assess the significance of a detection.

Figure 14. Errors in $C_\ell$ space calculated with (MC2) simulations as compared to (TH-$C_\ell$) theory, as a function of multipole. Dashed lines: unbinned theory; jagged lines: unbinned simulations; solid lines: binned theory; symbols: binned simulations. For all sky maps, the theoretical prediction works well, but for 10% of the sky we see a big discrepancy between theory (dashed lines) and simulations (jagged lines). This is due to covariance between modes and can be solved by binning the $C_\ell$ spectrum, as shown by the symbols (simulations) and solid line (predictions in Eq.24).
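For reference, the JK errors discussed in this section can be sketched with the standard delete-one jackknife estimator. We assume here that this matches the JK definition given earlier in the paper; the function names and the toy data are hypothetical, and in practice each row would be $w(\theta)$ measured on the map with one sub-region removed.

```python
import numpy as np

def jackknife_covariance(w_jk):
    """Delete-one jackknife covariance.

    w_jk: array of shape (M, n_bins); row k is the estimate obtained
    when sub-region k is removed from the map.
    """
    w_jk = np.asarray(w_jk, dtype=float)
    m = w_jk.shape[0]
    delta = w_jk - w_jk.mean(axis=0)
    # The (M - 1) / M factor accounts for the strong overlap between JK samples.
    return (m - 1) / m * (delta.T @ delta)

def jackknife_error(w_jk):
    """Diagonal JK error in each bin."""
    return np.sqrt(np.diag(jackknife_covariance(w_jk)))

# Toy check (hypothetical): the estimator is the mean over M = 8 sub-regions.
rng = np.random.default_rng(2)
zone_values = rng.normal(size=(8, 3))
m = zone_values.shape[0]
w_jk = (zone_values.sum(axis=0) - zone_values) / (m - 1)
jk_err = jackknife_error(w_jk)
```

For this toy estimator the JK variance reduces exactly to the usual variance of the mean, which is a convenient sanity check before applying the same function to correlation measurements.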

4.5 Variance in $C_\ell$

In $C_\ell$ space, we have compared MC2 errors to TH theory. Fig. 14 shows how both errors are hard to distinguish for the case of all sky (the middle dashed line closely matches the jagged line). Note that the shape of the $C_\ell$ errors exhibits a broad peak around $\ell \simeq 200$, illustrating the fact that errors are dominated by the $C_\ell^{TT}$ term. For 10% of the sky the TH error (upper dashed line) obtained theoretically from Eq.14 (with a factor $1/\sqrt{f_{sky}}$ with respect to all the sky) is much larger than the MC2 error (upper jagged line) in the simulations. As we have shown in Fig. 7, there is a strong covariance between different bins when $f_{sky} < 1$; this is in contrast with the TH estimation in Eq.14, which assumes a diagonal covariance matrix. We understand this discrepancy in the variance prediction as a transfer of power from the diagonal to off-diagonal errors.

We can get a better diagonal error estimate by binning $C_\ell$ in bins of width $\Delta\ell$ that make the covariance approximately diagonal (see footnote 6 below). When binning by $\Delta\ell$, the theoretical error (TH-$C_\ell$) in Eq.14 is reduced in quadrature to:

$$\Delta^2 C_\ell^{TG} = \frac{1}{\Delta\ell \, f_{sky} (2\ell+1)} \left[ (C_\ell^{TG})^2 + C_\ell^{TT} C_\ell^{GG} \right] \qquad (24)$$

This assumes that the bins are independent. Because of the partial sky coverage, the bins are not independent and the above formula will only be valid in the limit of large $\Delta\ell$.

We have tested the above formula for different sky fractions by binning the $C_\ell$ spectrum in the simulations and estimating the error from the scatter in different realizations. We find that the formula works above some minimum $\Delta\ell$ which roughly agrees with the width of off-diagonal coupling in the covariance matrix estimated from simulations (Fig. 7). We find that $\Delta\ell = 20, 16, 8, 1$ for $f_{sky} = 0.1, 0.2, 0.4, 0.8$ respectively diagonalize the covariance matrix and provide a good fit to the above theoretical error for binned spectra. In Fig. 14 we show the results for $\Delta\ell = 20$ for both all sky (triangles) and 10% of the sky (squares). The theoretical prediction in Eq.24 (solid lines) works very well in both cases, because the covariance with this binning is approximately diagonal.

Footnote 6. This is clear in Fig. 7, which shows that the covariance is confined to a finite number $\Delta\ell$ of off-diagonal elements. It is also apparent in Fig. 14, where the jagged line for 10% of the sky is clearly correlated on scales of $\Delta\ell \sim 20$, in contrast to the all-sky jagged line, which shows no correlation from bin to bin.

Figure 15. Relative error for the two fiducial models $\Omega_{DE} = 0.7$ (top) and $\Omega_{DE} = 0.8$ (bottom), as a function of angle in degrees. The different methods are labeled in the figure. We see that the relative error depends on the model, and that in the case $\Omega_{DE} = 0.8$ there is also a good agreement within the errors.

4.6 Dependence on $\Omega_{DE}$

Fig. 15 shows a relative comparison of how our error estimation changes for a different cosmology with $\Omega_{DE} = 0.8$ instead of $\Omega_{DE} = 0.7$. The MC error still fits the TH and TC predictions well, but the JK errors seem to underestimate the errors more than in the $\Omega_{DE} = 0.7$ case. This effect is not large given the dispersion in the errors from realization to realization (errorbars in Fig. 15).

Figure 16. Signal-to-noise for 10% of the sky for each singular value. Different lines correspond to methods as labeled.

Table 2. Signal-to-noise as a function of the fraction $f_{sky}$ covered in a survey with a broad distribution of sources with median redshift $z_m = 0.33$ for the $\Omega_{DE} = 0.7$ flat $\Lambda$CDM model with different error assumptions (see Table 1). Similar results are found for TC-w and JK-w methods.

Table 3. Signal-to-noise for all sky maps in harmonic space for two different values of $\Omega_{DE}$ and for galaxy maps with different mean depths $z_m$. The width of the redshift distribution is given by $\sigma_z \simeq z_m/2$ (see Eq.3) except in the last case ($z_m = 1$ Narrow), where $\sigma_z \simeq 0.17$.
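The binned harmonic-space error of §4.5 (Eq.24) can be sketched as below. The form of the unbinned error is the standard Gaussian expression, which we assume coincides with Eq.14 (not reproduced in this section); the function name and the input spectra values are hypothetical, while the bin widths are those quoted in §4.5.

```python
import math

# Minimum bin widths quoted in the text for each sky fraction.
MIN_DELTA_ELL = {0.1: 20, 0.2: 16, 0.4: 8, 0.8: 1}

def binned_cl_error(ell, cl_tg, cl_tt, cl_gg, f_sky=1.0, delta_ell=1):
    """Gaussian error on the cross-spectrum at multipole ell, in a bin of width delta_ell.

    delta_ell = 1 gives the unbinned error (assumed form of Eq.14).
    """
    variance = (cl_tg**2 + cl_tt * cl_gg) / (delta_ell * f_sky * (2 * ell + 1))
    return math.sqrt(variance)

# Hypothetical spectra values at ell = 50, for a survey covering 10% of the sky.
unbinned = binned_cl_error(50, 1e-7, 1e-3, 1e-6, f_sky=0.1)
binned = binned_cl_error(50, 1e-7, 1e-3, 1e-6, f_sky=0.1,
                         delta_ell=MIN_DELTA_ELL[0.1])
```

Binning by $\Delta\ell$ reduces the error by $\sqrt{\Delta\ell}$ relative to the unbinned value, which is only meaningful once $\Delta\ell$ exceeds the width of the off-diagonal coupling.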



Fig. 16 shows the S/N for each singular value. All methods agree well even for 10% of the sky. We get excellent agreement for all sky maps.

Because eigenvectors are orthogonal the total S/N is just added in quadrature:

$$\left(\frac{S}{N}\right)_T = \sqrt{\sum_i \left(\frac{S}{N}\right)_i^2} \qquad (26)$$

Table 2 compares total S/N values from simulations and theory for different survey areas. Here by simulations we mean the MC2-w method, where we have used 6 singular values, and theory refers to the different methods, including the TH-$C_\ell$ approach (see below). We note that we find apparently lower values than quoted in the literature (see e.g. Afshordi 2004). This is due to the low value adopted for $\Omega_{DE}$ (i.e. $\Omega_{DE} = 0.8$ models yield a S/N ratio $\sim 2$ larger than our fiducial value $\Omega_{DE} = 0.7$), and the fact that these are predictions for a single broad redshift bin (similar to the selection function for SDSS main sample galaxies), with median redshift $z_m \simeq 0.33$. A combination of several narrow bins at different redshifts will also increase the S/N (see Fig. 17 and Table 3 below).



5 CONSTRAINTS AND SIGNIFICANCE 

ISW measurements can directly constrain dark-energy pa- 
rameters independent of other cosmological probes. Here we 
shall use the covariance analysis presented in the previous 
section to derive significance levels for the cosmological pa- 
rameter constraints obtained from a cross-correlation anal- 
ysis. 

5.1 Signal-to-noise from $w(\theta)$

The signal-to-noise (S/N hereafter) depends on both the input fiducial model used in the simulations and the covariance matrix method we implement. In this paper we invert the covariance matrix using the standard method of singular value decomposition (SVD), see §3.3. In this approach one projects the signal onto the eigenvector space of the diagonalized matrix and only the most significant eigenvalues are kept for the analysis:

$$\left(\frac{S}{N}\right)_i = \left| \frac{\hat{w}_{TG}(i)}{\lambda_i} \right| = \left| \frac{1}{\lambda_i} \sum_{j=1}^{N} U_{ji} \frac{w_{TG}(j)}{\sigma_{TG}(j)} \right| \qquad (25)$$
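A minimal sketch of the mode-by-mode S/N and of its sum in quadrature (Eq.26) follows. It assumes that the signal is normalized by the diagonal error before being projected onto the singular vectors of the normalized covariance; the function name and the toy inputs are hypothetical.

```python
import numpy as np

def signal_to_noise(w_tg, cov, n_modes):
    """Per-mode and total S/N from the SVD of the normalized covariance."""
    w_tg = np.asarray(w_tg, dtype=float)
    cov = np.asarray(cov, dtype=float)
    sigma = np.sqrt(np.diag(cov))
    norm_cov = cov / np.outer(sigma, sigma)
    u, w, _ = np.linalg.svd(norm_cov)
    lam = np.sqrt(w)                      # singular values: W_ii = lambda_i^2
    projected = u.T @ (w_tg / sigma)      # sum_j U_ji w(j) / sigma(j)
    sn_modes = np.abs(projected[:n_modes]) / lam[:n_modes]
    return sn_modes, float(np.sqrt(np.sum(sn_modes**2)))

# Toy inputs (hypothetical): a random positive-definite covariance and signal.
rng = np.random.default_rng(1)
a = rng.normal(size=(5, 5))
toy_cov = a @ a.T + 5.0 * np.eye(5)
toy_signal = rng.normal(size=5)
sn_all_modes, sn_total = signal_to_noise(toy_signal, toy_cov, n_modes=5)
sn_3_modes, sn_total_3 = signal_to_noise(toy_signal, toy_cov, n_modes=3)
```

When all modes are kept, the total S/N squared equals the familiar quadratic form of the signal with the inverse covariance; truncating the modes can only lower it.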



5.2 Signal-to-noise forecast from $C_\ell$

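In harmonic space the S/N is estimated as

$$\left(\frac{S}{N}\right)^2 = \sum_\ell \left( \frac{C_\ell^{TG}}{\Delta C_\ell^{TG}} \right)^2 \qquad (27)$$

using Eq. 14 in the denominator. Note in particular that the dominant contribution to $\Delta C_\ell^{TG}$ in Eq. 14 comes from the $C_\ell^{TT} C_\ell^{GG}$ term and not from $C_\ell^{TG}$, which is an order of magnitude smaller. This means that the $(S/N)^2$ approximately scales as:

$$\left(\frac{S}{N}\right)^2 \simeq f_{sky} \sum_\ell (2\ell+1) \frac{(C_\ell^{TG})^2}{C_\ell^{TT} C_\ell^{GG}} \propto \sigma_8^2 \qquad (28)$$

and therefore depends strongly on the normalization of the dark matter power spectrum $P(k)$, and is independent of the galaxy bias $b$.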
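The harmonic-space forecast can be sketched as follows, again assuming the standard Gaussian form for the error in Eq.14. The toy spectra below are purely illustrative power laws, not the fiducial model, and the function name is hypothetical.

```python
import math

def harmonic_signal_to_noise(ells, cl_tg, cl_tt, cl_gg, f_sky=1.0):
    """Total S/N summed over multipoles, with a Gaussian error per multipole."""
    total = 0.0
    for ell, tg, tt, gg in zip(ells, cl_tg, cl_tt, cl_gg):
        variance = (tg**2 + tt * gg) / (f_sky * (2 * ell + 1))
        total += tg**2 / variance
    return math.sqrt(total)

# Toy spectra (hypothetical power laws, chosen so that TG^2 < TT * GG).
ells = list(range(2, 101))
cl_tt = [1.0 / (l * (l + 1)) for l in ells]
cl_gg = [1e-4 / l for l in ells]
cl_tg = [1e-3 / l**2 for l in ells]

sn_all_sky = harmonic_signal_to_noise(ells, cl_tg, cl_tt, cl_gg)
sn_10pct = harmonic_signal_to_noise(ells, cl_tg, cl_tt, cl_gg, f_sky=0.1)

# A linear bias b rescales TG by b and GG by b^2, leaving the S/N unchanged.
b = 2.0
sn_biased = harmonic_signal_to_noise(
    ells, [b * x for x in cl_tg], cl_tt, [b * b * x for x in cl_gg])
```

The two comparisons illustrate the statements in the text: the S/N scales as $\sqrt{f_{sky}}$ and does not depend on the galaxy bias.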

Clearly, the S/N will change depending on the fiducial model used. Fig. 17 shows this dependence on the plane of DE density vs. equation of state, $w$. Each panel corresponds to a different smooth redshift distribution that closely matches current or planned surveys. The upper panels show predictions for the SDSS main sample ($z_m \simeq 0.33$), those anticipated for the DES ($z_m = 0.7$), and a combined DES+VISTA survey ($z_m = 1$), respectively. For these 3 surveys we use broad distributions as given by Eq.(3), with a width that grows linearly with depth, $\sigma_z \simeq z_m/2$. For this rather generic parametrization of the selection function, the S/N monotonically increases with $z_m$ as shown by the 3 upper panels in Fig. 17, although the differential contribution, $d(S/N)/dz$, drops for sources at $z \gtrsim 0.4$ (see Afshordi 2004 for an analytic account of this effect).

Figure 17. All sky values of the S/N for different models. Each panel corresponds to a different redshift distribution: SDSS ($z_m = 0.3$), DES ($z_m = 0.7$), DES+VISTA ($z_m = 1$) and DES+VISTA NARROW ($z_m = 1$). Expectations for smaller survey areas are obtained by scaling the displayed values by $\sqrt{f_{sky}}$, where $f_{sky}$ is the sky fraction covered.

In particular, for our baseline survey, SDSS, and our fiducial $\Lambda$CDM model, we estimate $S/N \simeq 3.8$, which is in good agreement with simulations in configuration space (see Table 2). As we sample a wider range of the ISW signal in redshift, the S/N rises by $\sim 60\%$ when we increase the survey depth by a factor $\sim 2$ to match the depth of the DES-like survey. However, there is little gain in ISW detection significance when combining DES+VISTA, as the S/N only increases by an additional 5% with respect to the DES survey. For comparison, we also show the case of what we shall call the DES+VISTA NARROW survey. This survey has a Gaussian distribution of sources around $z_m = 1$, but with a narrow width, similar to that of SDSS above ($\sigma_z \simeq 0.17$). In this case, the high redshift population of sources brings little added value to the baseline survey (SDSS), improving the S/N by only 10%. As shown in Table 3 these conclusions vary somewhat for different values of $\Omega_{DE}$.

We point out that in these estimations we have ignored 
the lensing magnification bias contribution (see Loverde, Hui 
& Gaztanaga 2006) which could be important for z > 1. 



5.3 $\chi^2$ estimation



We shall discuss below to what extent the choice of covariance matrix estimation method affects cosmological parameter constraints. This is especially relevant because current ISW detection significance levels are still rather poor (i.e., at the 4-$\sigma$ level at most, see Cabre et al. 2006) and the practical implementation of methods might yield noticeably different results.

We shall compare the methods described in §3, where the fiducial model is the one implemented in the simulations. Our significance levels are derived from a $\chi^2$ statistic:

$$\chi^2 = \sum_{i,j=1}^{N} \Delta_i \, \hat{C}_{ij}^{-1} \, \Delta_j \qquad (29)$$

where:

$$\Delta_i = \frac{w_{TG}^{E}(\theta_i) - w_{TG}^{M}(\theta_i)}{\sigma_{TG}(\theta_i)} \qquad (30)$$

is the difference between the "estimation" E and the model M. We have run models for $\Omega_{DE}$ from 0.5 to 0.9 and for $w$ from -3.0 to -0.2, and we fix the estimation E to be our fiducial model, $\Omega_{DE} = 0.7$ and $w = -1$, which was input in the simulation. The size of the resulting confidence level contours depends implicitly on the best-fit model (i.e., the fiducial model) by construction.

Figure 18. $\chi^2$ contours from (MC2-w) simulations (in color) compared to the other methods (solid line): JK-w, MC1-w, TC-w, TH-w and TH-$C_\ell$, as labeled in each panel (panel titles include "MC2-w vs JK-w", "MC2-w vs MC1" and "MC2-w vs TC-w"). All cases correspond to 10% of the sky. The contour levels are: 0.25, 1, 4 and 9.

Figure 19. Same as Fig. 18, but here each panel compares to the method TH-$C_\ell$ for 20%, 40% and 80% of the sky.

In each case, the error used is the one obtained from the simulations (for cases MC2-w, MC1-w, JK-w, MC2-$C_\ell$) or from the theoretical estimator (TH-w and TH-$C_\ell$) for the given fiducial model. That is, the errors are not varied as we sample parameter space in the $\chi^2$ estimation. This allows a direct comparison of the contours when using different covariance matrix estimators.
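The $\chi^2$ evaluation with fixed errors can be sketched as below. For illustration only, the model family is a single hypothetical amplitude multiplying a fixed template, standing in for the actual scan over $\Omega_{DE}$ and $w$; the errors, the normalized covariance and the template are toy values.

```python
import numpy as np

def chi2(w_est, w_model, sigma, norm_cov_inv):
    """Chi-square of Eq.29, with differences normalized by the errors as in Eq.30."""
    delta = (np.asarray(w_est, dtype=float)
             - np.asarray(w_model, dtype=float)) / np.asarray(sigma, dtype=float)
    return float(delta @ np.asarray(norm_cov_inv, dtype=float) @ delta)

# Toy setup (hypothetical): 4 angular bins with fixed errors and covariance.
sigma = np.array([0.2, 0.15, 0.1, 0.1])
idx = np.arange(4)
norm_cov = 0.5 ** np.abs(idx[:, None] - idx[None, :])
norm_cov_inv = np.linalg.inv(norm_cov)

template = np.array([1.0, 0.8, 0.5, 0.3])
w_fiducial = 0.3 * template          # plays the role of the estimation E

# The errors are kept fixed while the model amplitude is scanned.
amplitudes = np.linspace(0.0, 0.6, 61)
chi2_grid = np.array([chi2(w_fiducial, a * template, sigma, norm_cov_inv)
                      for a in amplitudes])
best_amplitude = amplitudes[np.argmin(chi2_grid)]
```

Because the estimation is set to the fiducial model, the minimum of the grid sits at the input value and only the width of the $\chi^2$ contours depends on the covariance estimator.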

Results are shown in Figs. 18, 19 and 20. In the different figures we compare the real space MC (MC2-w) result (colored contours) with the other methods (contours traced by solid lines). Contours from different methods agree remarkably well: the agreement depends neither on the space in which we compute the errors and covariance (real or harmonic space), nor on the portion of the sky used. We have checked that small contour differences are compatible once we take into account uncertainties in the errors, as shown in Fig. 12. Moreover, using a diagonal approximation for the $C_\ell$ covariance matrix to infer the covariance in real space (through the Legendre transform in Eq.16) works surprisingly well for a small portion of the sky. As explained in §4.5, when we use the theoretical error in $C_\ell$ space (TH-$C_\ell$) for real data, we should use a bin of width $\Delta\ell$ that varies with the portion of the sky.

5.4 Best fit model 

In this section we investigate how the error method used affects the best fit estimation of cosmological parameters. We fix all the parameters as in the fiducial model, except for $\Omega_{DE}$. We focus on the case of the angular 2-point correlation $w(\theta)$ and compare results for JK-w to those for MC2-w. We do a $\chi^2$ fit of the correlation from each single simulation, which is used as the "E" estimator in Eq.30. This follows what is done with real data, where the observations correspond to a single realization. We can build a distribution of all the best fit values of $\Omega_{DE}$ that we obtain from each realization, which is shown in Fig. 21. The error and covariance used in the fit are in one case the JK-w ones obtained using this single simulation (dashed line in Fig. 21) or the MC2-w ones calculated from all the simulations (continuous line). Despite these differences there is an excellent agreement between the JK-w and the MC2-w results.

Figure 20. Same as Fig. 18 for all sky maps. Here solid lines correspond to methods TH-w (left panel) and TH-$C_\ell$ (right panel).

Figure 21. Distribution of best fit values for $\Omega_{DE}$, fixing $w = -1$ and all the other parameters to the fiducial model (panel title: "Best fit distribution for $\Omega_{DE} = 0.7$"). For JK (dashed lines), we use the JK-w error obtained for each simulation. For the MC2 simulations (solid line for 10% of the sky, dotted line for all sky), we use a fixed error obtained from the dispersion in the simulations.

We see that the distribution of best fit values is biased towards higher values than the underlying fiducial model value $\Omega_{DE} = 0.7$. In particular, the distribution of best-fit values is skewed, showing a long tail of values smaller than the input model. This is due to the fact that contours in $\chi^2$ (and in the S/N) are not symmetric. The reason for this is the nonlinear mapping between values of $\Omega_{DE}$ and the amplitude of $w(\theta)$. When the errors are large, this nonlinear mapping transforms an approximately Gaussian distribution (which is a good approximation for the distribution of $w(\theta)$) into a strongly non-Gaussian distribution in $\Omega_{DE}$. When the errors are smaller, as happens for larger $\Omega_{DE}$, the mapping between $w(\theta)$ and $\Omega_{DE}$ is better approximated by a linear relation, which results in a more Gaussian distribution. Thus, if we have small enough errors, this bias is negligible, as we can see for the all-sky case shown by the dotted lines in Fig. 21.



6 SUMMARY & CONCLUSIONS 

We have run a large number of pairs of sky map simulations, which we call MC2. Each pair is a stochastic realization of an auto- and cross-correlation signal that we input to the simulation, which we call the fiducial model. We have focused our attention on testing the galaxy-temperature cross-correlation, so each pair of simulations corresponds to a CMB and a galaxy map. For the fiducial model we take the current concordance $\Lambda$CDM scenario. We have run simulations for different values of $\Omega_{DE}$ and have tested maps with different fractions of the sky. We have concentrated on the case $\Omega_{DE} = 0.7$ and $f_{sky} = 0.1$, which broadly matches current observations and results in large errors (> 50%).

We are interested in error analysis/forecast and significance estimation. We calculate the correlation between maps and use the different realizations to work out the statistics. We then compare the results to the different approximations that have been used so far in the literature. One of the approximations, which we call MC1, uses montecarlo simulations for the CMB maps with a fixed (observed) galaxy map (i.e. with no cross-correlation signal or sampling variance in the galaxies). We test a popular harmonic space prediction, which we shall call TH (Theory in Harmonic space). We also test Jack-Knife (JK) errors, which use sub-regions of the actual data to calculate the dispersion in our estimator. Finally, we introduce a novel error estimator in real space, which we call TC (Theory in Configuration space). For both models and simulations we have assumed that the underlying statistics in the maps are Gaussian. Our main results can be summarized as follows:

a) The number of simulations needed for numerical convergence (to within $\sim 5\%$ accuracy) in the computation of the covariance matrix is about 1000 (see §2.1.2).

b) Diagonal errors in $w(\theta)$ are very accurate in both TH and TC approximations for all sky maps. This is shown in the bottom lines of Fig. 10. For maps covering a fraction of the sky $f_{sky} < 1$, the agreement is also good on small scales ($\theta < 20$ deg), as can be seen in Fig. 11.

c) Even for a map as wide as 10% of the sky, the survey geometry starts to be important for errors in the cross-correlation above 10 degrees. This is shown in simulations as a sharp inflection that begins at 30 degrees in Fig. 10 (solid line). Our new TC method predicts this inflection well, while the more traditional TH method totally misses this feature.

d) If we only use one single realization for the galaxies (MC1) the error seems to be systematically underestimated by about 10% on all scales. This bias is expected as we have neglected the variance in the galaxy field and the cross-correlation signal.

e) The JK errors do quite well within 10% accuracy on 
all scales, including the larger scales where boundary effects 
start to be important (see triangles in Fig. 10). 

f) The dispersion in the error estimator (error in the 
error) for individual realizations is of the order 20% (see 
Fig. 12). This uncertainty is inherent to the JK method, be- 
cause one uses the observations (a single realization) to esti- 
mate errors. But it is also implicit in other methods because 
our knowledge of the models is limited by the data and can 
be thought of as a "sampling variance error" . 

g) S/N (see Fig. 16) and parameter estimation (see Fig. 
20 and Table 2) are equivalent when we do the analysis in 
configuration and harmonic space. This was expected for all 
sky maps, but it is not trivial for partial sky coverage (see 
comments below). 

h) It is possible to propagate errors and covariances from $C_\ell$ to $w(\theta)$ (harmonic to configuration space) using Eq.(16). Starting from a diagonal (all sky) covariance matrix in $C_\ell$, the resulting covariance matrix in $w(\theta)$ is quite accurate as compared to direct estimation from simulations.

i) The above propagation also works well for a map with a fraction $f_{sky}$ of the sky, by just scaling the $C_\ell$ errors by a factor $1/\sqrt{f_{sky}}$ with respect to all the sky. This is surprising because for $f_{sky} < 1$ the covariance matrix in $C_\ell$ is no longer diagonal (see Fig. 7) and the actual measured $C_\ell$ errors in simulations do not simply scale with $1/\sqrt{f_{sky}}$ (see Fig. 14). Thus, Eq.(16) should not be expected to remain valid. We believe that this works because the two effects compensate. There is a transfer of power from diagonal to off-diagonal elements of the covariance matrix which, for the scales of interest (smaller than the survey area), seems to correspond to a rotation that does not affect the final errors from Eq.(16).

j) If we want to use the popular TH approach in Eq.(14) with $f_{sky} < 1$ we need to bin the $C_\ell$ data in multipole bands of width $\Delta\ell$. The binned spectrum has a diagonal covariance when $\Delta\ell$ is large enough and the error in the binned spectrum approximately follows Eq.(24).

k) When the errors are large (i.e., for partial sky coverage and $\Lambda$CDM models with not so large $\Omega_{DE}$) there is a significant bias in the distribution of the recovered best-fit values of $\Omega_{DE}$, as shown in Fig. 21. This is because of the nonlinear mapping between $\Omega_{DE}$ and the amplitude of $w(\theta)$.

l) S/N forecasts for future surveys, shown in Fig. 17 and Table 3, strongly depend on the fiducial model used. For example, an all-sky survey with broadly distributed sources around a median redshift $z_m = 1$ and $\Omega_{DE} = 0.8$ can detect the ISW effect with a S/N $\sim 11$.
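The propagation from harmonic to configuration space mentioned in points h) and i) can be sketched as follows. We assume here that Eq.(16) has the standard Legendre form for a diagonal $C_\ell$ covariance, in which each multipole contributes with weight $[(2\ell+1)/4\pi]^2$; the function name and the toy errors are hypothetical.

```python
import numpy as np
from numpy.polynomial.legendre import legvander

def w_covariance_from_cl(theta_deg, delta_cl):
    """Covariance of w(theta) from diagonal errors on C_ell.

    delta_cl[l] is the 1-sigma error on C_ell for l = 0..lmax;
    different multipoles are assumed to be uncorrelated.
    """
    delta_cl = np.asarray(delta_cl, dtype=float)
    lmax = delta_cl.size - 1
    ell = np.arange(lmax + 1)
    x = np.cos(np.radians(theta_deg))
    p = legvander(x, lmax)                      # p[i, l] = P_l(cos theta_i)
    weights = ((2 * ell + 1) / (4 * np.pi)) ** 2 * delta_cl**2
    return (p * weights) @ p.T

# Toy check (hypothetical): only the quadrupole carries an error.
toy_cov_w = w_covariance_from_cl([0.0, 30.0], [0.0, 0.0, 1.0])

# Partial sky: scale the all-sky C_ell errors by 1/sqrt(f_sky) before propagating.
f_sky = 0.1
toy_cov_w_partial = w_covariance_from_cl(
    [0.0, 30.0], np.array([0.0, 0.0, 1.0]) / np.sqrt(f_sky))
```

In this form the $1/\sqrt{f_{sky}}$ rescaling of the $C_\ell$ errors translates into a $1/f_{sky}$ rescaling of the whole $w(\theta)$ covariance, which is the approximation that point i) finds to work well on scales smaller than the survey.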

What method should be used when confronted with 
real data? Running realistic simulations seems the best ap- 
proach, but is very costly because we need of order 1000 sim- 
ulations for each model we want to explore. The theoretical 
modeling of errors seems quite accurate and is much faster 
to implement. The main advantage of the JK approach is 
that the errors are obtained from the same data in a model 
independent way. This is important because real data could 
surprise our prejudices and also because, in the ISW case, 
the errors are very large and the data can accommodate 
different models. 

As an example, consider the analysis of Cabre et al. (2006),
who recently cross-correlated the SDSS-DR4 galaxies with
the WMAP3 CMB anisotropies. Using the JK approach
with $w(\theta)$ they estimate a S/N ~ 3.6 for the r = 20 - 21
sample, which has a mean redshift of $z_m = 0.33$. These
numbers are high compared with the values in Table 3 for
$z_m = 0.33$, which for $f_{sky} = 0.13$ give a low S/N ~ 2, even
for $\Omega_{DE} = 0.8$. The dominant contribution to the S/N in Table 3
scales as $C_\ell^{TG}/\sqrt{C_\ell^{GG}}$ (i.e., see Eq.28) and is therefore
independent of bias, but depends on $\sigma_8$. We have noticed
that in fact the actual measured values of $C_\ell^{TG}/\sqrt{C_\ell^{GG}}$ in
the SDSS DR4-WMAP3 maps are almost a factor of 2 larger
than the values in the concordance $\Omega_{DE} = 0.8$ ($\sigma_8 = 0.9$,
$n = 1$, $\Omega_\nu = 0$, $\Omega_b = 0.05$, $h = 0.7$) model. This explains
the discrepancy in the S/N and illustrates the danger of
blindly using theoretical errors that are model dependent.
The discrepancy of the concordance model with the SDSS4-WMAP3
measured values of $C_\ell^{TG}/\sqrt{C_\ell^{GG}}$ is not very significant
once we account for sampling errors (less than 3-sigma),
but it could be an indication of new physics that makes the
P(k) normalization higher than in the concordance model, i.e.,
deviations in $\sigma_8$, spectral index n, neutrinos, etc., away
from the fiducial model we are considering.

We have also shown that it is possible to use the other
theoretical models (i.e., TC and TH) to make model-independent
error predictions from observations. Contrary to all
other methods, the JK approach does not assume Gaussian
statistics, but its accuracy could depend on the model or the
way it is implemented (i.e., shape and number of sub-regions).
We conclude that to be safe one needs to validate the JK
method with simulations, but there is no reason a priori to
expect that this method is inaccurate.
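The dependence on the number of sub-regions can be explored with a few lines of code. The following is a minimal sketch of the standard jack-knife covariance, assuming the usual (M - 1)/M prefactor; the function name `jackknife_covariance` and the toy numbers are illustrative assumptions, and the exact convention should be checked against the JK definition in the main text.

```python
import numpy as np

def jackknife_covariance(w_jk):
    """Covariance of w(theta) from M jack-knife estimates.

    w_jk has shape (M, n_bins): row k holds w(theta) measured with the
    k-th sub-region removed from the maps.
    """
    w_jk = np.asarray(w_jk, dtype=float)
    n_regions = w_jk.shape[0]
    diff = w_jk - w_jk.mean(axis=0)
    # The (M - 1)/M prefactor accounts for the M sub-samples sharing
    # most of their area and therefore being strongly correlated.
    return (n_regions - 1) / n_regions * diff.T @ diff

# Toy input (made-up numbers): 3 sub-regions, 2 angular bins.
w_jk = np.array([[1.0, 0.5], [1.2, 0.4], [0.8, 0.6]])
cov = jackknife_covariance(w_jk)
errors = np.sqrt(np.diag(cov))  # diagonal JK errors per angular bin
```

Repeating the calculation with different numbers or shapes of sub-regions gives a direct handle on how stable the JK errors are for a given survey.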

In summary, we have presented a detailed testing of
different error approximations that have been used in the
literature, both in configuration and harmonic space. Contrary
to some claims in the literature (see Introduction), we
show that the different errors (including the JK method)
are equivalent within the sampling uncertainties. By this we
mean not only that the error and covariance are similar but
also that they produce very similar signal-to-noise (S/N)
and recovery of cosmological parameters.

Figure A1. Representation of the $\hat{w}_{AB}(\theta)$ estimator. The product AB is averaged in an ordered way over all pairs of $\theta$-separated points
that belong to the survey area (black points).

APPENDIX A: COVARIANCE MATRIX AND ERRORS IN CONFIGURATION SPACE 
A1 The estimator

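Consider two fields in the sky, $A(q)$ and $B(q')$, which correspond to one realization of the universe. We want to estimate the true
two-point cross-correlation function of the universe $w_{AB}$ by averaging over the sky in the survey area S. The estimator is

$$\hat{w}_{AB}(\theta) = \langle A(q)B(q')|_{\widehat{qq'}=\theta} \rangle_S \qquad \mathrm{(A1)}$$

where we average over all pairs separated by an angle between $\theta$ and $\theta + \Delta\theta$ in the survey region. This can be put in an integral form

$$\hat{w}_{AB}(\theta) = \frac{1}{S_N}\int_S dq\,dq'|_{\widehat{qq'}=\theta}\, A(q)B(q') = \frac{1}{4\pi\,2\pi\sin\theta\,\Delta\theta\,P(\theta)}\int dq \int \sin\theta\,\Delta\theta\,d\varphi\; A(q)\,B(q+\theta(\varphi))\,D(q,\theta,\varphi) \qquad \mathrm{(A2)}$$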

where $S_N$ is the normalization factor and the integral is over $q \in S$ and $\varphi \in (0, 2\pi)$. As illustrated in Fig. A1,
we integrate all the pairs in an ordered way. First, we fix a point q and sum over all the $\theta$-separated pairs related to this
point, moving around $\varphi$. Since not all the points in the sky $\theta$-separated from q belong to the survey, we introduce a selection
function $D(q, \theta, \varphi)$ which is one if the second point belongs to the survey and zero otherwise. We perform this operation at
each point of the survey. The origin of $\varphi$ is not relevant; it could be taken for instance as the angle between the $\theta$-pair
and the geodesic line between the point q and the pole. The second integration is over all the points in the survey.

The normalization factor $S_N$ is a measure of the number of $\theta$-pairs allowed by the survey, which depends on $\theta$. For an
all-sky survey, $S_N$ is $4\pi\,2\pi\sin\theta\,\Delta\theta$. The geometry of the survey is enclosed in a multiplicative factor $P(\theta)$, which is actually
the ratio between the number of $\theta$-pairs in the survey and the number of $\theta$-pairs in the whole sky, i.e., the probability that
a $\theta$-pair thrown at random on the whole sky falls into the survey. When $\theta = 0$, this probability is equal to the fraction of sky
$f_{sky}$ covered by the survey. In an all-sky survey, $P(\theta) = 1$ and also $D(q, \theta, \varphi) = 1$, and the estimator for the cross-correlation
is given by:

$$\hat{w}_{AB}(\theta) = \frac{1}{4\pi\,2\pi}\int dq\,d\varphi\; A(q)\,B(q+\theta(\varphi)) \qquad \mathrm{(A3)}$$
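On a pixelized map the estimator above reduces to an average of the product AB over all pixel pairs in each angular bin. The sketch below is a brute-force illustration only, assuming equal-area pixels given as unit vectors and zero-mean fields; the function name `cross_correlation` and the toy inputs are hypothetical. Restricting the input to pixels inside the survey plays the role of the selection function D, and the pair count per bin plays the role of $S_N$.

```python
import numpy as np

def cross_correlation(vecs, a, b, theta_edges):
    """Brute-force estimate of w_AB(theta) over all ordered pixel pairs.

    vecs: (n_pix, 3) unit vectors of the pixels inside the survey.
    a, b: (n_pix,) zero-mean field values at those pixels.
    theta_edges: increasing angular bin edges in radians.
    """
    cosang = np.clip(vecs @ vecs.T, -1.0, 1.0)
    theta = np.arccos(cosang)            # separation of every ordered pair
    prod = np.outer(a, b)                # A(q) B(q') for every ordered pair
    which = np.digitize(theta.ravel(), theta_edges) - 1
    n_bins = len(theta_edges) - 1
    ok = (which >= 0) & (which < n_bins)
    num = np.bincount(which[ok], weights=prod.ravel()[ok], minlength=n_bins)
    cnt = np.bincount(which[ok], minlength=n_bins)
    with np.errstate(invalid="ignore", divide="ignore"):
        return num / cnt                 # NaN where a bin contains no pairs

# Toy example: three mutually orthogonal pixels (all separations pi/2).
vecs = np.eye(3)
a = np.array([1.0, -1.0, 0.0])
b = np.array([2.0, 0.0, -2.0])
w = cross_correlation(vecs, a, b, np.array([0.0, 0.1, 2.0]))
```

The cost grows as the square of the number of pixels, so this form is only practical for coarse maps or small patches.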



A2 Covariance 

In order to get the covariance, we need to relate the estimator of the cross-correlation with the true cross-correlation value.
The true cross-correlation value $w_{AB}(\theta)$ is the average over realizations of the estimator (where the estimator $\hat{w}_{AB}(\theta)$ is
obtained by averaging over all the $\theta$-pairs of the sky). Due to homogeneity, $w_{AB}(\theta)$ is also equal to the average of any $\theta$-pair of fixed points A
and B over all the realizations.

$$w_{AB}(\theta) = \langle \hat{w}_{AB}(\theta)\rangle_{realization} = \langle A(q_1)B(q_2)\rangle_{realization} \quad \forall\; \widehat{q_1q_2} = \theta \qquad \mathrm{(A4)}$$

$$\hat{w}_{AB}(\theta) = \langle A(q)B(q')|_{\widehat{qq'}=\theta}\rangle_{sky} \qquad \mathrm{(A5)}$$

Figure A2. Writing the four-point moment in 15 terms of connected parts. For a Gaussian field only the two-point terms remain.

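where $\langle\,\rangle$ means averaging over all the realizations from now on.
The covariance for an arbitrary estimator is

$$C_{ij} = C(\theta_i,\theta_j) = \langle (\hat{w}(\theta_i) - \langle\hat{w}(\theta_i)\rangle)(\hat{w}(\theta_j) - \langle\hat{w}(\theta_j)\rangle)\rangle$$
$$\qquad = \langle (\hat{w}(\theta_i) - w(\theta_i))(\hat{w}(\theta_j) - w(\theta_j))\rangle = \langle \hat{w}(\theta_i)\hat{w}(\theta_j)\rangle - w(\theta_i)w(\theta_j) \qquad \mathrm{(A6)}$$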



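Thus, for our cross-correlation estimator, the covariance is given by

$$C_{ij} = \left\langle \frac{1}{S_N(\theta_i)}\int_S dq_1 dq_2\big|_{\widehat{q_1q_2}=\theta_i} A(q_1)B(q_2)\;\frac{1}{S_N(\theta_j)}\int_S dq_3 dq_4\big|_{\widehat{q_3q_4}=\theta_j} A(q_3)B(q_4)\right\rangle - w_{AB}(\theta_i)w_{AB}(\theta_j)$$
$$\qquad = \int_S \frac{dq_1 dq_2 dq_3 dq_4}{S_N(\theta_i)S_N(\theta_j)}\bigg|_{\widehat{q_1q_2}=\theta_i,\ \widehat{q_3q_4}=\theta_j} \langle A(q_1)B(q_2)A(q_3)B(q_4)\rangle - w_{AB}(\theta_i)w_{AB}(\theta_j) \qquad \mathrm{(A7)}$$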

What we are doing in Eq. (A7) is fixing four points in the sky (two $\theta$-separated pairs) and averaging this fixed configuration
over realizations of the universe. Then we integrate over all allowed 4-point configurations. The realization average over the
four fixed points can be simplified and expressed as a function of two-field correlations (shown in Fig. A2):

$$\langle A(1)B(2)A(3)B(4)\rangle = \langle A(1)B(2)\rangle\langle A(3)B(4)\rangle + \langle A(1)A(3)\rangle\langle B(2)B(4)\rangle + \langle A(1)B(4)\rangle\langle A(3)B(2)\rangle \qquad \mathrm{(A8)}$$

where $A(i) = A(q_i)$ and $B(j) = B(q_j)$, under these two conditions:

• $\langle A(k)\rangle = \langle B(l)\rangle = 0$

• $\langle A(1)B(2)A(3)B(4)\rangle_c = 0$

Those are very soft requirements. Regarding the first condition, we can always modify a field with non-zero average to one
with zero average just by subtracting its mean (sky averaged)$^7$ value at each point. The second condition is that the fourth
connected moment is zero. This is true for Gaussian statistics and is always a very good approximation for almost Gaussian
fields. Note that for fields with zero mean, the second moments and the second connected moments are equal.
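The decomposition into products of two-point terms can be checked numerically for a Gaussian field. The sketch below draws four correlated zero-mean Gaussian variables standing in for A(1), B(2), A(3), B(4) and compares the Monte Carlo four-point moment with the sum of the three pair products. The covariance matrix is an arbitrary, made-up example and is not taken from any map.

```python
import numpy as np

# Arbitrary (assumed) covariance of four zero-mean Gaussian variables.
# It is symmetric and strictly diagonally dominant, hence positive definite.
cov = np.array([
    [1.0, 0.4, 0.3, 0.2],
    [0.4, 1.0, 0.2, 0.1],
    [0.3, 0.2, 1.0, 0.3],
    [0.2, 0.1, 0.3, 1.0],
])

rng = np.random.default_rng(0)
x = rng.multivariate_normal(np.zeros(4), cov, size=400_000)

# Monte Carlo estimate of the full four-point moment <x1 x2 x3 x4>.
fourth = np.mean(x[:, 0] * x[:, 1] * x[:, 2] * x[:, 3])

# Sum over the three pairings: <12><34> + <13><24> + <14><23>.
wick = cov[0, 1] * cov[2, 3] + cov[0, 2] * cov[1, 3] + cov[0, 3] * cov[1, 2]

print(f"Monte Carlo: {fourth:.4f}   pair products: {wick:.4f}")
```

For a non-Gaussian field the difference between the two numbers would estimate the connected fourth moment that is neglected here.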

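Focus for a moment on the first term of equation (A8). This term has two $\theta$-pairs that are uncoupled. The average over
realizations will give, for each pair, the cross-correlation value at the corresponding $\theta$, i.e., $\langle A(1)B(2)\rangle\langle A(3)B(4)\rangle = w_{AB}(\theta_i)\,w_{AB}(\theta_j)$.
This value is constant for each 4-point configuration, thus when integrating this term we still get the same
result. This uncoupled term cancels the last term in equation (A7). Therefore we only have to calculate two terms:

$$C_{ij} = \int_S \frac{dq_1 dq_2 dq_3 dq_4}{S_N(\theta_i)S_N(\theta_j)}\bigg|_{\widehat{q_1q_2}=\theta_i,\ \widehat{q_3q_4}=\theta_j} \langle A(1)A(3)\rangle\langle B(2)B(4)\rangle$$
$$\qquad + \int_S \frac{dq_1 dq_2 dq_3 dq_4}{S_N(\theta_i)S_N(\theta_j)}\bigg|_{\widehat{q_1q_2}=\theta_i,\ \widehat{q_3q_4}=\theta_j} \langle A(1)B(4)\rangle\langle A(3)B(2)\rangle \qquad \mathrm{(A9)}$$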

We have to choose convenient variables to integrate, which will differ slightly for the first and second integral. Fig.
A3 shows how to choose the variables. The idea is the following. Let us stay with the first case, where we have to integrate
$\langle A(1)A(3)\rangle\langle B(2)B(4)\rangle$. First, we fix one point in the sky, A(1). Second, we notice that A(1) is related to B(2) because
they form a $\theta$-pair, and is related to A(3) because they have to be cross-correlated over realizations. Then we decide to fix the distance

7 Here the sky-averaged mean is the estimator for the true mean.

Figure A3. Variables for integrating equation A9. Panels: $\langle A(1)A(3)\rangle\langle B(2)B(4)\rangle$ and $\langle A(1)B(4)\rangle\langle A(3)B(2)\rangle$.



$\psi$ to the fourth point B(4). This is an adequate way to partly decouple the integrations. Now the cross-correlation pairs
only depend on two angles, $\psi$ and one $\varphi_i$, i.e., $\langle A(1)A(3)\rangle = w_{AA}(\phi(\psi,\varphi_4,\theta_j))$ and $\langle B(2)B(4)\rangle = w_{BB}(\phi'(\psi,\varphi_1,\theta_i))$. Here
$\phi$ and $\phi'$ are the angles between 1,3 and 2,4 respectively. The same idea about which variables to use in the integration is
applied to the $\langle A(1)B(4)\rangle\langle A(3)B(2)\rangle$ term.

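In order to make our deduction clearer we will follow our explanation for an all-sky survey. Afterwards we will comment
on the case when only a fraction of the sky area is allowed. For an all-sky survey, we easily separate the two $\theta$-pairs, having

$$C_{ij} = \frac{1}{(4\pi\,2\pi)^2}\int dq_1 \int 2\pi\sin(\psi)\,d\psi \left[\int_0^{2\pi} d\varphi_4\, w_{AA}(\phi(\psi,\varphi_4,\theta_j))\right]\left[\int_0^{2\pi} d\varphi_1\, w_{BB}(\phi'(\psi,\varphi_1,\theta_i))\right]$$
$$\qquad + \frac{1}{(4\pi\,2\pi)^2}\int dq_1 \int 2\pi\sin(\psi)\,d\psi \left[\int_0^{2\pi} d\varphi_3\, w_{AB}(\phi(\psi,\varphi_3,\theta_j))\right]\left[\int_0^{2\pi} d\varphi_1\, w_{AB}(\phi'(\psi,\varphi_1,\theta_i))\right] \qquad \mathrm{(A10)}$$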



When doing the average over realizations we have lost the dependence on the position of the 4-point configuration and
only the distances between points remain important. We will get $4\pi$ from the $dq_1$ integration. Also, if preferred, due to the
symmetry in $\varphi$, $\int_0^{2\pi} d\varphi \to 2\int_0^{\pi} d\varphi$.

Using spherical trigonometry, we can relate the angular distance $\phi$ in the cross-correlation $w_X(\phi)$ with the related
angles $\psi$, $\varphi$ and $\theta$. The relation is given by the cosine law of spherical trigonometry

$$\cos(\phi) = \cos(\psi)\cos(\theta) + \sin(\psi)\sin(\theta)\cos(\varphi) \qquad \mathrm{(A11)}$$
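The cosine law is easy to verify against a direct dot product of unit vectors. In this small sketch the common vertex of the two arcs is placed at the pole, so that $\psi$ and $\theta$ become colatitudes and $\varphi$ a longitude difference; the function names are illustrative only.

```python
import numpy as np

def pair_distance(psi, theta, varphi):
    """Angular distance phi from the spherical cosine law, Eq. (A11)."""
    c = (np.cos(psi) * np.cos(theta)
         + np.sin(psi) * np.sin(theta) * np.cos(varphi))
    return np.arccos(np.clip(c, -1.0, 1.0))

def unit_vector(colat, lon):
    """Unit vector at colatitude colat and longitude lon (radians)."""
    return np.array([np.sin(colat) * np.cos(lon),
                     np.sin(colat) * np.sin(lon),
                     np.cos(colat)])

# Two points at arc lengths psi and theta from the pole, separated in
# azimuth by varphi; their mutual distance should satisfy Eq. (A11).
psi, theta, varphi = 0.7, 0.3, 1.1
p = unit_vector(psi, 0.0)
q = unit_vector(theta, varphi)
phi_from_vectors = np.arccos(p @ q)
phi_from_law = pair_distance(psi, theta, varphi)
```

The limiting cases $\varphi = 0$ and $\varphi = \pi$ return $|\psi - \theta|$ and $\psi + \theta$, as expected for points on the same great circle.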
We arrive at the following equations:

where X stands for any two-field combination AA, AB, BB. When estimating the covariance, the true value of $w_X$ has to
be substituted by its estimated value.
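As a hedged sketch of this step, carrying out the $dq_1$ integral in Eq. (A10), which contributes $4\pi$, leads to a one-dimensional integral over $\psi$. The symbol $W_X(\theta,\psi)$ for the azimuthal average of $w_X$ is a notation assumed here by analogy with the two-argument functions in the kernel of Eq. (A14); it is not confirmed by the text.

```latex
W_X(\theta,\psi) \equiv \frac{1}{2\pi}\int_0^{2\pi} d\varphi\;
    w_X\big(\phi(\psi,\varphi,\theta)\big),
\qquad
C_{ij} = \frac{1}{2}\int_0^{\pi}\sin\psi\,d\psi\,
    \Big[\,W_{AA}(\theta_j,\psi)\,W_{BB}(\theta_i,\psi)
         + W_{AB}(\theta_j,\psi)\,W_{AB}(\theta_i,\psi)\Big]
```

The prefactor follows from $4\pi \cdot 2\pi \cdot (2\pi)^2 / (4\pi\,2\pi)^2 = 1/2$, where the factor $(2\pi)^2$ comes from writing each azimuthal integral as $2\pi W_X$.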

By construction, the covariance is symmetric in its arguments, i.e., $C_{ij} = C_{ji}$. This symmetry still remains in equation
(A12) but it is hidden. It remains because we integrated over all four-point configurations. It is hidden because of the chosen
coordinates for the integration. When integrating, we privilege some points over others. We separate the integral by fixing
two $\psi$-separated points and integrating over the $\varphi$ angles. If the points chosen to be $\psi$-separated were B(2) and A(3) instead of
A(1) and B(4) we would have ended with equation (A12) with $\theta_i \leftrightarrow \theta_j$. Although the symmetry exists, we find it convenient to
make it more explicit. In equation (A12) we change the kernel to

$$K[\theta_i,\theta_j,\psi] = \frac{1}{2}\left[W_{AA}(\theta_i,\psi)W_{BB}(\theta_j,\psi) + W_{AA}(\theta_j,\psi)W_{BB}(\theta_i,\psi)\right] + W_{AB}(\theta_i,\psi)W_{AB}(\theta_j,\psi) \qquad \mathrm{(A14)}$$
A3 Partial sky survey

A3.1 Probability considerations

In most cases, our survey covers only a restricted area of the sky from which to estimate the cross-correlation signal. If we throw a point at random,
the probability that it falls into the survey area is the fraction of the sky covered by the survey, $f_{sky}$. We define $P(\theta)$ as
the probability that a randomly thrown $\theta$-pair in the sky falls (both points) into the survey area. This probability can also
be understood as the ratio between the number of $\theta$-pairs of the survey and the total number of $\theta$-pairs in the sky. The conditional
probability $P(\theta/1p)$ that both points fall into the survey, once we know that one is already inside, is given by

$$P(\theta/1p) = \frac{P(\theta)}{f_{sky}} \qquad \mathrm{(A15)}$$

We define $P(\psi,\theta,\phi)$ as the probability that the triangle of sides $\psi$, $\theta$, $\phi$ falls (entirely) inside the survey area when thrown
randomly in the sky. The conditional probability that the third point of a triangle falls into the survey, when we know that
the other two points, $\psi$-separated, are already in, is

$$P(\psi,\theta,\phi/\psi) = \frac{P(\psi,\theta,\phi)}{P(\psi)} \qquad \mathrm{(A16)}$$

It is also useful to remember that $S_N(\theta) = 4\pi\,2\pi\sin\theta\,\Delta\theta\,P(\theta)$ is the normalization factor of the estimator.

The probabilities $P(\theta)$ and $P(\psi,\theta,\phi)$ have to be computed for each survey geometry. In Appendix B we compute those
probabilities for a polar cap survey, i.e., one which contains all points at distances less than r from a given point (the pole). A
polar cap geometry is a very useful approximation for most cosmological surveys, which are compact and extend over a wide
area. For those surveys, we can use the probabilities $P(\theta)$ and $P(\psi,\theta,\phi)$ in Appendix B.



A3.2 The covariance integration



We are in the case of a limited area of the sky, where we have to integrate only the 4-point configurations allowed in this area to
compute the covariance. We focus on the first term of equation (A9), which we name $I_1$. We replace the integration over
the survey configurations by the integration over all configurations convolved with a delta-selection function D which selects
the configurations in the survey.

$$I_1 = \int_{4\pi} \frac{dq_1 dq_2 dq_3 dq_4}{S_N(\theta_1)S_N(\theta_2)}\, D(q_1,q_2,q_3,q_4)\,\langle A(1)A(3)\rangle\langle B(2)B(4)\rangle$$
$$\quad = \frac{1}{S_N(\theta_1)S_N(\theta_2)} \int_{4\pi} dq_1\, D(q_1) \int_0^{\pi} d\psi \int_0^{2\pi} d\alpha\,\sin(\psi)\,D(q_4)$$
$$\qquad \times \int_0^{2\pi} d\varphi_4\, \sin\theta_2\,\Delta\theta_2\; w_{AA}(\phi(\psi,\varphi_4,\theta_2))\,D(q_3) \int_0^{2\pi} d\varphi_1\, \sin\theta_1\,\Delta\theta_1\; w_{BB}(\phi'(\psi,\varphi_1,\theta_1))\,D(q_2) \qquad \mathrm{(A17)}$$



where $\alpha$ is the angle between the $\psi$-pair and the line from $q_1$ to the pole. The other angles are as in Figure A3. In the expression
above we have formally split $D(q_1,q_2,q_3,q_4)$ into four parts $D(q_1)D(q_2)D(q_3)D(q_4)$ which in fact have the same meaning:
they are unity only when all 4 points are inside the survey and they are zero otherwise.

This is an exact result for the $I_1$ term. The key point here is to approximate the integrals over the D's by replacing those
selection functions D by a convenient probability. Here we are throwing all 4 points in an ordered way. We throw the first
point at $q_1$; the selection function $D(q_1)$ selects whether this point falls into the survey area. The substitution $D(q_1) \to f_{sky}$
applies here. The next point to be thrown is $q_4$, which is $\psi$-related to $q_1$. We substitute $D(q_4)$ by $P(\psi/1p)$, i.e., the probability that,
once a point is inside the survey area, a second $\psi$-separated point also falls inside. The next two points, $q_2$ and $q_3$, are not related
to each other but they are related to the two previous points already thrown. $D(q_2)$ and $D(q_3)$ have to be substituted by the
probability that, given two $\psi$-separated points inside the survey, a third point is also in the survey at distances $\theta$ and $\phi$ from
those previous points. $D(q_3)$ is substituted by $P(\psi,\theta_2,\phi/\psi)$, and $D(q_2)$ by $P(\psi,\theta_1,\phi'/\psi)$ (with $\theta_1 \equiv \theta_i$ and $\theta_2 \equiv \theta_j$). When expanding the normalization
factors and doing the integrals we get:



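$$I_1 = \frac{f_{sky}}{4\pi\,2\pi\,P(\theta_i)P(\theta_j)} \int_0^{\pi} d\psi\,\sin(\psi)\,P(\psi/1p) \left[\int_0^{2\pi} d\varphi_4\, w_{AA}(\phi(\psi,\varphi_4,\theta_j))\,P(\psi,\theta_j,\phi/\psi)\right] \left[\int_0^{2\pi} d\varphi_1\, w_{BB}(\phi'(\psi,\varphi_1,\theta_i))\,P(\psi,\theta_i,\phi'/\psi)\right] \qquad \mathrm{(A18)}$$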



Unfortunately, by replacing $D \to P$ we (slightly) break the symmetry $\theta_i \leftrightarrow \theta_j$. Thus, for the covariance, we will use the
kernel in equation (A14), which will recover this symmetry. Replacing the conditional probabilities calculated in §A3.1 with
the non-conditional ones we arrive at the final result



*- «^^ £Wh*» (AI9) 

$$K[W] = \frac{1}{2}\left[W_{AA}(\theta_i)W_{BB}(\theta_j) + W_{AA}(\theta_j)W_{BB}(\theta_i)\right] + W_{AB}(\theta_i)W_{AB}(\theta_j) \qquad \mathrm{(A20)}$$

J 
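The replacement of conditional by non-conditional probabilities can be sketched explicitly for the $I_1$ term. Assuming Eqs. (A15) and (A16) read $P(\psi/1p) = P(\psi)/f_{sky}$ and $P(\psi,\theta,\phi/\psi) = P(\psi,\theta,\phi)/P(\psi)$, the factor $f_{sky}$ in Eq. (A18) cancels and only unconditional probabilities remain. This is an illustrative rearrangement and not necessarily the exact form in which the final result is written:

```latex
I_1 \simeq \frac{1}{4\pi\,2\pi\,P(\theta_i)P(\theta_j)}
\int_0^{\pi} d\psi\,\frac{\sin\psi}{P(\psi)}
\left[\int_0^{2\pi} d\varphi_4\; w_{AA}(\phi)\,P(\psi,\theta_j,\phi)\right]
\left[\int_0^{2\pi} d\varphi_1\; w_{BB}(\phi')\,P(\psi,\theta_i,\phi')\right]
```

Here $\phi = \phi(\psi,\varphi_4,\theta_j)$ and $\phi' = \phi'(\psi,\varphi_1,\theta_i)$, as in Eq. (A18).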



A3.3 Small angle approximation

In the small angle approximation, $\theta_i$ and $\theta_j$ are small. We can consider that when one point of the $\theta$-pair falls into the survey,
the other also does. This corresponds to $P(\psi,\theta,\phi) \to P(\psi)$, while $P(\theta) \to f_{sky}$. We then have:

$$C_{ij} = \frac{C_{ij}^{[\mathrm{all\,sky}]}}{f_{sky}} \qquad \mathrm{(A22)}$$

which gives the popular approximation of the error scaling as $\propto 1/\sqrt{f_{sky}}$, which is used in the TH method in §2.3 and Eq.14. To the
same level of accuracy, we can also choose to use the equation above together with Eq.A12 and avoid further calculations. As
we want to go one step further we will give some prescriptions below for more realistic situations. This will not only improve
the accuracy of the calculation but will also allow us to study when the small angle approximation is good enough in a given
situation.



APPENDIX B: PROBABILITIES OF FINDING PAIRS, TRIANGLES, AND POLYGONS IN A
POLAR CAP SURVEY

Consider a given $\theta$-pair or a spherical triangle $\Delta(\theta,\psi,\phi)$ randomly thrown on a sphere. In this section we compute the
probabilities of finding them inside a polar cap survey of area A.

A polar cap of radius r is the union of all points of a sphere with (spherical) distances less than r to a given point (the
pole). The area of a polar cap is:

$$A = 2\pi R^2\,[1 - \cos(r)] \qquad \mathrm{(B1)}$$

where R is the radius of the sphere. We set it equal to one, as usual in spherical trigonometry.

The probability that an N-point polygon thrown at random falls inside a polar cap of radius r is equal to the intersection area of N
circles of radius r, each one centered on one of the polygon vertices, normalized to (divided by) the total area of the
sphere, i.e., $4\pi$.

Why is this so? The probability for a given polygon to be thrown inside a circle of radius r (polar cap) already on the
sphere is the same as the probability of first drawing the polygon on the sphere and then throwing the circle and finding it
encompassing all N vertices. For this to happen, the center of the circle must be at a distance less than r from every one of the
polygon points. Only those points in the area intersected by the N circles of radius r, one from each vertex, satisfy this condition.
Then, the probability of finding the polygon inside a polar cap survey of radius r is that area divided by the total area of the
sphere.

B1 Probability for a $\theta$-pair: $P(\theta)$

As we have seen, the probability for a $\theta$-pair thrown randomly onto a sphere to fall inside a circle of radius r (polar cap survey)
is:

$$P(\theta) = \frac{1}{4\pi}\left(\text{intersection area of two circles of radius } r \text{ separated by a distance } \theta\right) \qquad \mathrm{(B2)}$$

In order to compute the area of the intersection A, we make use of Figures B1 and B2 as well as spherical trigonometry
formulae. When $\theta > 2r$ there is no intersection and the probability $P(\theta)$ is zero. When $\theta < 2r$ and $2r + \theta < 2\pi$ we construct

Figure B1. Geometry for the intersection of two spherical circles. It is needed for the determination of $P(\theta)$.

Figure B2. Representation of how to obtain the intersection area of two spherical circles from sectors of spherical circles and spherical
triangles.

two symmetrical triangles $\Delta(\theta,r,r)$, as shown in Figure B1. The area of the intersection is given by the sum of two sectors of
spherical circle minus the area of those triangles. In spherical trigonometry, the area of a triangle is given by the sum of the
angles between its sides minus $\pi$. Thus,

$$A = [2\varphi(1-\cos(r))] + [2\varphi(1-\cos(r))] - [2(\varphi+\varphi+D-\pi)] = 2\pi - 2D - 4\varphi\cos(r) \qquad \mathrm{(B3)}$$

where D and $\varphi$ are given by the cosine law and the semiperimeter half-angle formulae

$$\cos(D) = \frac{\cos(\theta) - \cos^2(r)}{\sin^2(r)} \qquad \mathrm{(B4)}$$

$$\tan\left(\frac{\varphi}{2}\right) = \sqrt{\frac{\sin(s-r)\sin(s-\theta)}{\sin(s)\sin(s-r)}} = \sqrt{\frac{\sin(r-\theta/2)}{\sin(r+\theta/2)}}, \qquad s = \frac{\theta+r+r}{2} \qquad \mathrm{(B5)}$$



When $\theta < 2r$ but $2r + \theta > 2\pi$ (and therefore $r > \pi/2$) the two r-circumferences do not intersect each other although the
areas of the two circles still overlap. The area of the intersection is the whole sphere except the area of the two complementary circles,
i.e.,

$$A = 4\pi - [4\pi - 2\pi(1-\cos(r))] - [4\pi - 2\pi(1-\cos(r))] = -4\pi\cos(r) \qquad \mathrm{(B6)}$$
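The three cases above translate directly into a short function. The following is a minimal sketch on the unit sphere, taking the second overlapping case to be $2r + \theta \ge 2\pi$; the name `pair_probability` is an illustrative choice.

```python
import numpy as np

def pair_probability(theta, r):
    """P(theta) for a polar cap of angular radius r on the unit sphere."""
    if theta >= 2.0 * r:
        return 0.0                                    # circles do not overlap
    if 2.0 * r + theta >= 2.0 * np.pi:
        area = -4.0 * np.pi * np.cos(r)               # Eq. (B6)
    else:
        cos_d = (np.cos(theta) - np.cos(r) ** 2) / np.sin(r) ** 2       # Eq. (B4)
        d = np.arccos(np.clip(cos_d, -1.0, 1.0))
        half = np.sqrt(np.sin(r - theta / 2) / np.sin(r + theta / 2))   # Eq. (B5)
        phi = 2.0 * np.arctan(half)
        area = 2.0 * np.pi - 2.0 * d - 4.0 * phi * np.cos(r)            # Eq. (B3)
    return area / (4.0 * np.pi)                       # Eq. (B2)

# At zero separation the pair probability reduces to the sky fraction.
r_cap = 0.5
f_sky = (1.0 - np.cos(r_cap)) / 2.0
p_zero = pair_probability(0.0, r_cap)
```

Two simple checks are available: at $\theta = 0$ the result equals $f_{sky} = [1 - \cos(r)]/2$, and for a hemisphere ($r = \pi/2$) the overlap of the two caps is a lune, giving $P(\theta) = (\pi - \theta)/2\pi$.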



B2 Probability for a triangle $\Delta(\psi,\theta,\phi)$: $P(\psi,\theta,\phi)$

As we have seen, the probability for a triangle $\Delta(\psi,\theta,\phi)$ thrown randomly on a sphere to fall inside a circle of radius r (polar
cap survey) is:

$$P(\psi,\theta,\phi) = \frac{1}{4\pi}\left(\text{intersection area of three circles of radius } r \text{ centered at the vertices}\right) \qquad \mathrm{(B7)}$$

When the intersection does not exist, $P(\psi,\theta,\phi) = 0$. This is the case if $r < \rho_0$, where $\rho_0$ is the radius of the spherical
circumference that circumscribes the triangle $\Delta(\psi,\theta,\phi)$. It can be shown$^8$ that,

$$\sin(\rho_0) = \frac{2\sin(\psi/2)\sin(\theta/2)\sin(\phi/2)}{\sqrt{S\,(S-2\sin(\psi/2))\,(S-2\sin(\theta/2))\,(S-2\sin(\phi/2))}} \qquad \mathrm{(B8)}$$

where $S = \sin(\psi/2) + \sin(\theta/2) + \sin(\phi/2)$. When $\rho_0 < r$ and $\rho_0 + r < \pi$, the intersection area, A, exists, and to compute
it we can make use of the figures in B2 as well as spherical trigonometry formulae. The intersection area A is delimited by
three arcs of circles of radius r. Three points mark the intersection of those arcs in the limiting region. One can construct
a spherical triangle $\Delta(a,b,c)$ having those points as vertices. This is the blue triangle in figure B2, which shows all necessary
angles for this section. Figure B2 also shows how one can get the area A from the triangles and sectors of circles. Following
this figure the area is



$$A = [B(1-\cos(r))] - [B'+B'+B-\pi] + [A(1-\cos(r))] - [A'+A'+A-\pi]$$
$$\quad + [C(1-\cos(r))] - [C'+C'+C-\pi] + [\alpha+\beta+\gamma-\pi] \qquad \mathrm{(B9)}$$

= 2tt - cos(r)(A + B + C)-E-r-T = 27r- cos(r)(2E' - 2r' - IT' - <p - w - r) - E - V - T 



where we have used that the angles $\alpha$, $\beta$, $\gamma$, A, B, C can be expressed as a sum of other angles. The remaining angles can be
obtained from the spherical cosine law. We write only three angles here, but the others can be computed in a similar way.



8 $\rho_0$ can be deduced by relating the spherical and the Euclidean triangles with the same vertices. The radius of a circumscribed
circumference is well known in the Euclidean case.
= cos(0)-cosWcosW 
sm(t/>) sin(fl) 
cQ = cos W -cos 2 (r) 

sin 2 (r) 

, w . cos(r) — cos(i/0 cos(r) 

cos ( S ) = • i T\ ■ (\ 

sm(i/>) sin(r) 

When $\rho_0 < r$ but $\rho_0 + r > \pi$ the intersection of the three circles exists but we cannot construct such a triangle as in B2.
The area is given by the whole sky except the sum of the intersection areas of the two-point complementary circles, i.e.,

$$A = 4\pi - 4\pi(1-P(\psi)) - 4\pi(1-P(\theta)) - 4\pi(1-P(\phi)) = 4\pi\,[P(\psi)+P(\theta)+P(\phi)-2] \qquad \mathrm{(B11)}$$
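Which of the cases of this section applies is decided by the circumscribed radius $\rho_0$. The sketch below computes it from the Euclidean chord triangle with the same vertices, following the argument of the footnote: the chords have lengths $2\sin(\cdot/2)$, and the planar circumradius of that triangle equals $\sin(\rho_0)$ on the unit sphere. The function name is illustrative, and only the branch $\rho_0 \le \pi/2$ is returned by the arcsine.

```python
import numpy as np

def circumradius(psi, theta, phi):
    """Angular radius rho_0 of the circle through the triangle's vertices.

    psi, theta, phi are the spherical side lengths (radians) on the unit
    sphere. Only the solution with rho_0 <= pi/2 is returned.
    """
    # Chord lengths of the Euclidean triangle sharing the same vertices.
    a, b, c = (2.0 * np.sin(x / 2.0) for x in (psi, theta, phi))
    s = 0.5 * (a + b + c)                              # semi-perimeter
    area = np.sqrt(s * (s - a) * (s - b) * (s - c))    # Heron's formula
    return np.arcsin(a * b * c / (4.0 * area))         # sin(rho_0) = abc / (4 area)

# Octant triangle: vertices on the x, y and z axes, all sides pi/2.
rho_octant = circumradius(np.pi / 2, np.pi / 2, np.pi / 2)
```

For the octant triangle the circumcenter lies along the direction (1,1,1), so $\rho_0 = \arccos(1/\sqrt{3})$; for a small equilateral triangle the result approaches the flat-space value of side/$\sqrt{3}$.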